The statistical study of spatial patterns and processes has during the last few years provided a series of challenging problems for theories of statistical inference. Those challenges are the subject of this essay. As befits an essay, the results presented here are not in definitive form; indeed, many of the contributions raise as many questions as they answer. The essay is intended both for specialists in spatial statistics, who will discover much that has been achieved since the author's book (Spatial Statistics, Wiley, 1981), and for theoretical statisticians with an eye for problems arising in statistical practice.
This essay arose from the Adams Prize competition of the University of Cambridge, whose subject for 1985/6 was ‘Spatial and Geometrical Aspects of Probability and Statistics’. (It differs only slightly from the version which was awarded that prize.) The introductory chapter answers the question ‘what's so special about spatial statistics?’ The next three chapters elaborate on this by providing examples of new difficulties: with likelihood inference in spatial Gaussian processes, and with the dominance of edge effects in the estimation of interaction in point processes. We show by example how Monte Carlo methods can make likelihood methods feasible in problems traditionally thought intractable.
The last two chapters deal with digital images. Here the problems are principally ones of scale, dealing with up to a quarter of a million data points. Chapter 5 takes a very general Bayesian viewpoint and shows the importance of spatial models in encapsulating prior information about images.
Images as data occur increasingly frequently in a wide range of scientific disciplines. The scale of the images varies widely: from meteorological satellites, which view scenes thousands of kilometres square, and optical astronomy, looking at sections of space, down to electron microscopy working at scales of 10µm or less. What they all have in common is a digital output of an image. With a few exceptions this is on a square grid, so each output measures the image within a small square known as a pixel. The measurement on each pixel can be a greylevel, typically one of 64 or 256 levels of luminance, or a series of greylevels representing luminance in different spectral bands. For example, earth resources satellites use luminance in the visual and infrared bands, typically four to seven numbers in total. One may of course use three bands to represent red, blue and green and so record an arbitrary colour on each pixel.
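The pixel representation described above can be sketched as a small data structure. The band names and the tiny 3 × 3 scene below are illustrative assumptions, not taken from any particular satellite or sensor:

```python
# A digital image as a square grid of pixels, each carrying one greylevel
# per spectral band. Band names and grid size are hypothetical examples.

BANDS = ["red", "green", "blue", "infrared"]  # e.g. four of a satellite's bands
LEVELS = 256  # greylevels per band: one of 256 luminance levels

def make_image(rows, cols, bands=BANDS):
    """Create an image: a rows x cols grid of per-band greylevel records."""
    return [[{b: 0 for b in bands} for _ in range(cols)] for _ in range(rows)]

def set_pixel(image, r, c, values):
    """Record measured greylevels (0..LEVELS-1) for one pixel."""
    for band, level in values.items():
        if not 0 <= level < LEVELS:
            raise ValueError(f"greylevel {level} out of range for band {band}")
        image[r][c][band] = level

image = make_image(3, 3)
set_pixel(image, 0, 0, {"red": 120, "green": 80, "blue": 40, "infrared": 200})
```

A full-colour display would use only the three visual bands of each pixel; the remaining bands carry the extra spectral information mentioned above.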
The resolution (the size of each pixel, hence the number of pixels per scene) is often limited by hardware considerations in the sensors. Optical astronomers now use 512 × 512 arrays of CCD (charge coupled device) sensors to replace photographic plates. The size of the pixels is limited by physical problems and also by the fact that these detectors count photons, so random events limit the practicable precision. In many other applications the limiting factor is digital communication speed. Digital images can be enormous in data-processing terms.
This essay aims to bring out some of the distinctive features and special problems of statistical inference on spatial processes. Realistic spatial stochastic processes are so far removed from the classical domain of statistical theory (sequences of independent, identically distributed observations) that they can provide a rather severe test of classical methods. Although much of the literature has been very negative about the problem, a few methods have emerged in this field which have spread to many other complex statistical problems. There is a sense in which spatial problems are currently the test bed for ideas in inference on complex stochastic systems.
Our definition of ‘spatial process’ is wide. It certainly includes all the areas of the author's monograph (Ripley, 1981), as well as more recent problems in image processing and analysis. Digital images are recorded as a set of observations (black/white, greylevel, colour…) on a square or hexagonal lattice. As such, they differ only in scale from other spatial phenomena which are sampled on a regular grid. Now the difference in scale is important, but it has become clear that it is fruitful to regard imaging problems from the viewpoint of spatial statistics, and this has been done quite extensively within the last five years.
Much of our consideration depends only on geometrical aspects of spatial patterns and processes.
This chapter will compare and contrast the text-generation method described in the previous chapters with other recent work in the field. This will not include the large body of research done recently on discourse planning, but only work concerned with realizing these plans. The most notable exclusion on these grounds is McKeown's TEXT (McKeown, 1982, 1983, 1985), which sets some discourse-related goals and then does the actual text generation using unguided search and backtracking (see Appelt, 1983, p. 599).
A look at recent systems reveals that there are currently two general approaches to text generation: the “grammar-oriented” approach and the “goal-oriented” approach. Both will be outlined, including their major practitioners and the advantages each offers.
Next, the systems that try to combine these two approaches will be considered. It will be shown that SLANG successfully achieves this, capturing the advantages of both the grammar-oriented and goal-oriented approaches.
The grammar-oriented approach
Several of the major text-generation projects are “grammar-oriented.” This term will be used to refer to those systems that traverse an explicit linguistic grammar. Since the flow of control is directed by the grammar traversal, the logical structure of the system reflects the structure of the grammar.
The original grammar-oriented systems would simply traverse the grammar, typically an ATN, backtracking where necessary. More recent grammar-oriented systems avoid backing up by doing an analysis at choice points to make sure the right decision is made the first time. This analysis often involves considering semantic and pragmatic issues that, for reasons of modularity, should not be directly accessible to the grammar.
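The contrast between blind backtracking and analysis at choice points can be illustrated with a toy traversal. The tiny “grammar” and the semantic test below are hypothetical sketches, not taken from SLANG or any system discussed here:

```python
# Sketch: deciding at a grammar choice point by lookahead analysis, so the
# right branch is taken the first time and no backing up is needed.
# The grammar and the semantic check are illustrative inventions.

GRAMMAR = {
    "CLAUSE": [["ACTIVE"], ["PASSIVE"]],   # a choice point: two alternatives
    "ACTIVE": [["agent", "process", "goal"]],
    "PASSIVE": [["goal", "process"]],
}

def choose(symbol, semantics):
    """Pick an alternative at a choice point by inspecting the semantics,
    instead of trying one blindly and backtracking on failure."""
    alternatives = GRAMMAR[symbol]
    if len(alternatives) == 1:
        return alternatives[0]
    # hypothetical semantic/pragmatic test consulted at the choice point
    return alternatives[1] if semantics.get("agent") is None else alternatives[0]

def generate(symbol, semantics):
    """Traverse the grammar top-down, expanding until only terminals remain."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: a functional slot to be filled later
    out = []
    for child in choose(symbol, semantics):
        out.extend(generate(child, semantics))
    return out

print(generate("CLAUSE", {"agent": "the cat", "goal": "the mouse"}))
# -> ['agent', 'process', 'goal']   (an agent is present, so active is chosen)
```

As the text notes, putting such a semantic test directly inside the traversal is exactly the modularity violation that motivates the alternatives considered later.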
This chapter will describe the first implementation of the Systemic Linguistic Approach to Natural-language Generation (SLANG-I). SLANG-I has been implemented as a production system using the production language OPS5. Since many OPS5 rules appear in this chapter, a short introduction to OPS5 has been provided in Appendix A.
This chapter is divided into five sections. The first is an overview of the system as a whole. It will provide high-level descriptions and explanations for the mechanisms used in the implementation. The second section is a detailed description of the System Network – OPS5 Rule Translator (SNORT) which outputs the grammar in the form of OPS5 production rules that can be used by SLANG-I. The text-generation system itself–SLANG-I–is described in detail in the third section. The fourth and fifth sections look at the limitations of this implementation and some possible alternatives respectively. Finally, a summary is given.
Overview
The purpose of this section is to provide a high-level overview of SNORT and SLANG-I before getting down to details in the next two sections. To a large extent this is made necessary by the interdependence between these two components. It is impossible to motivate the output of SNORT before explaining to a certain extent how SLANG-I works, and similarly SLANG-I cannot be explained before it is understood how the grammar is represented in OPS5 production rules. This section consists of four parts: first, a presentation of the abstract architecture of SLANG-I; second, a discussion of the OPS5 productions representing the systemic grammar; third, a description of the data structures used by SLANG-I; and fourth, a brief look at the control strategy used to coordinate the text-generation process.
This final chapter consists of four parts. First, the main points from the previous chapters will be summarized, giving a condensed description of the work done on the SLANG approach to text generation. Second, the problems that may impede progress on SLANG will be examined. Third, some ideas for future research will be explored. Fourth, the concluding remarks will include an evaluation of SLANG and the current progress, and the prospects for the future.
Summary
The problem
One problem that has persistently occupied and bedevilled text-generation research is how to interface higher-level reasoning with an explicit grammar written in an established linguistic formalism. This problem is central to text generation because of the computational and linguistic requirements of the task.
Text generation involves an enormous, complex search space, yet must be performed quickly if it is to be effective. These characteristics suggest that text generation requires the powerful knowledge-based computational methods–such as forward-chaining and goal-directed backward-chaining–developed in AI over the past fifteen years.
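The two computational methods named above can be sketched in miniature. The rule contents here are invented for illustration and are not drawn from any text-generation system:

```python
# Minimal sketches of forward-chaining and goal-directed backward-chaining.
# The rules are hypothetical illustrations.

RULES = [  # (premises, conclusion)
    ({"goal-known", "lexicon-loaded"}, "can-plan-clause"),
    ({"can-plan-clause"}, "can-generate"),
]

def forward_chain(facts):
    """Forward chaining: fire every rule whose premises hold, adding its
    conclusion, until no new conclusions appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts):
    """Goal-directed backward chaining: a goal holds if it is a known fact,
    or some rule concluding it has premises that can themselves be proved."""
    if goal in facts:
        return True
    return any(conclusion == goal and all(backward_chain(p, facts) for p in premises)
               for premises, conclusion in RULES)
```

Forward chaining works from the data toward whatever follows; backward chaining works from a goal back toward the data, which is why it suits the goal-directed search that text generation requires.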
Text generation also has important linguistic requirements. Specifically, an explicit grammar that is represented in an established linguistic formalism is required. This enables direct input from linguists and the linguistic literature. It also allows the grammar to be understood, judged, modified and so on, independently of the computational concerns (Appelt, 1982). Finally, assuming that the processing is guided by the explicit grammar, the grammar can provide a useful display of the logical structure of the text-generation process.
The problem of interfacing the AI problem-solving techniques with the linguistic formalism arises because of the apparent incompatibility of the representations involved.
This chapter is intended to provide the background required to understand the AI aspects of the text-generation method presented in Chapter 4. This introduction will not be a comprehensive survey, but rather a primer to specific concepts and perspectives relevant to the generation method. First, the architecture of AI problem solving will be outlined. Second, the search-space model will be introduced and some search methods will be examined. Third, the idea of knowledge compilation will be explored, emphasizing the key role played by compiled knowledge in AI problem solving.
The architecture of AI problem solving
This discussion of the architecture of AI problem solving will have several limitations and biases–it will address AI problem solving as it is manifested in practical AI projects (especially expert systems), and it will be strongly biased toward the architecture and representations adopted in Chapter 6.
While the architecture of AI problem solving corresponds to some degree to the computer-science architecture–a program processing data–(see Brownston et al., 1985, p. 56), this correspondence is more likely to be misleading than helpful here. The architecture of AI problem solving involves three major components–the knowledge base, the inference engine, and the data memory or working memory. It must be assumed that any given problem solver will have been constructed to operate in some specific problem area or domain. Information about the classification of domain objects, the properties associated with these classes, the operations that can be performed within the domain, and other invariant domain-specific knowledge is stored in the knowledge base.
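The three components just described fit together in a recognize-act cycle of the kind OPS5 uses. The sketch below is a toy rendering of that architecture; the rules and facts are illustrative, not domain knowledge from any real system:

```python
# The three components described above, as a toy production system:
# a knowledge base of rules, a working (data) memory, and an inference
# engine running a recognize-act cycle (cf. OPS5). Rules are illustrative.

# Knowledge base: invariant domain rules (condition over memory -> new element)
KNOWLEDGE_BASE = [
    (lambda wm: "request" in wm, "plan"),
    (lambda wm: "plan" in wm, "utterance"),
]

def inference_engine(working_memory):
    """Repeatedly match rules against working memory and fire one that
    contributes something new, until no rule applies."""
    wm = set(working_memory)  # data memory: the evolving problem state
    while True:
        fired = False
        for condition, addition in KNOWLEDGE_BASE:
            if condition(wm) and addition not in wm:
                wm.add(addition)  # act: record the rule's contribution
                fired = True
                break  # recognize again from the top of the rule set
        if not fired:
            return wm
```

The knowledge base stays fixed across problems; only the working memory changes from one run to the next, which is the separation the paragraph above describes.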
Any work on text generation must give an account of the linguistic theory–adopted or created–on which the generation process operates. This chapter is an introduction to the linguistic theory adopted here–systemic grammar. The linguistic representation plays a particularly important role in this work. Indeed, an understanding of many of the computational text-generation ideas requires an understanding of the underlying concepts in systemic theory.
This introduction to systemic grammar begins with a short history focused on the major contributors: Malinowski, Firth, Hjelmslev and Halliday. Then the goals or aims of systemic grammar are outlined. Some of the concepts from systemic theory which are most relevant to this work are then discussed in detail. Finally, descriptions of the stratification of systemic grammar, and in particular of the semantic stratum, are given.
History
Malinowski (1884–1942)
The origins of systemic linguistics clearly lie in the work of the anthropologist and ethnographer Bronislaw Malinowski (e.g. 1923). From Malinowski come two ideas that have had a profound influence on systemic theory. The first is the observation of the inseparability of language from its social and cultural context (Whorf must also be credited as an influence on this point – Kress, 1976, pp. ix–x). Malinowski argued that language could only be viewed and explained with reference to the social and cultural milieu. It is important to note the sharp contrast between this starting point of systemic linguistics and the starting point of the structural/formal tradition: that language is a self-contained system (ibid., p. viii). Most importantly here, Malinowski provided the idea of “context of situation” – a description of the contextual factors influencing an utterance.
The research reported here was done within the Department of Artificial Intelligence at the University of Edinburgh. All the chapters but one are, with some modifications, chapters from my doctoral thesis. The exception (Chapter 5) is a slightly revised version of a paper, written jointly with Graeme Ritchie, that was presented at the Third International Workshop on Natural Language Generation.
I would like to thank my thesis supervisor, Graeme Ritchie, for his patient and constructive criticism throughout the development of this work, and of course, for his direct contribution to Chapter 5. My other supervisor, Austin Tate, and the rest of the Edinburgh planning group provided insights into AI problem solving. I would also like to thank my thesis examiners, C. S. Mellish and Henry Thompson, for their helpful suggestions. I am also grateful to Mark Drummond, Andy Golding and Chris Sothcott for valuable technical discussions, to Mark Kingwell for proof-reading the thesis draft, and to Aravind Joshi as editor of the Cambridge University Press Studies in Natural Language Processing series.
This research was supported in part by Alberta and Canada Student Loans, and an Overseas Research Student Award. The word-processing and typesetting facilities used in the preparation of the final draft were kindly provided by the Department of Computer Science at the University of Calgary.
Despite the fact that systemic grammar has a relatively long history, and has been adopted in several computer implementations (Davey, 1978; Mann and Matthiessen, 1983), it has never been rigorously formalized in the way that traditional grammars have. The reason for this appears to be that the formal tools applied to traditional structural (syntagmatic) grammars are not so easily applied to a functional theory. In addition, it seems that the “rigorous rules” used to formalize traditional grammars are viewed by systemic linguists as inherently structural (e.g. Halliday, 1978, pp. 191–2). The formal model of systemic grammar presented here will involve “rigorous rules” but will not compromise the functional perspective. This formalization will allow the definition of such notions as the language generated by a grammar, and the demonstration of results relating to properties of a generation algorithm based on the previous chapter. The central issues discussed include the correctness and efficiency of this algorithm.
Two warnings should be given concerning this formal model. First, the generation algorithm presented (§5.3 and §5.4) is based on the description of systemic grammar presented in Chapter 3 which was in turn based partly on Halliday (1978). As a result of certain assumptions (especially concerning the input to the generation–in this case involving preselection), it may not be compatible with models based on other versions of the theory. Second, this formalization is largely exploratory, in that it is meant to investigate and illustrate the possibility of providing a rigorous formalization of systemic grammar suitable for defining a generation mechanism–it is not meant to be the definitive formalization of systemic grammar.