Data abstraction is based formally on the use of logical types to model data. To make effective use of data abstraction, it is generally necessary to define special-purpose types for both design models and specifications of required behaviour. The formal properties these types are required to have will depend on the kind of behaviour being specified, on the level of abstraction at which the devices are described, and on how accurate the specifications are intended to be.
This means that no fixed collection of logical types can be an adequate basis for specifying all devices. The types bool and num→bool, for example, are sufficient for specifying hardware behaviour at the level of abstraction where the devices used are flip-flops and combinational logic gates. But at the level of abstraction where the primitive components are transistors, an accurate model of behaviour has to account for more kinds of values than can be represented by the two truth-values T and F. It may be necessary to represent electrical signals of several different strengths, or to represent ‘undefined’ or ‘floating’ values. The types bool and num→bool are also insufficient for specifications at the register-transfer level of abstraction, where it is often necessary to specify behaviour not in terms of the values on single wires but in terms of vectors of bits and arithmetical operations on fixed-width binary words. At the architecture level, concise specifications may require comparatively complex abstract data types, such as stacks and queues.
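The contrast drawn above between two-valued and multi-valued models can be made concrete. The sketch below (in Python, with hypothetical names; it is not the book's HOL notation) models a gate-level signal as a function from time to booleans, mirroring the logical type num→bool, and then shows why a transistor-level model needs a richer signal type with 'undefined' and 'floating' values.

```python
from enum import Enum

# Gate-level model: a signal is a function from time (int) to bool,
# mirroring the logical type num -> bool.
def inverter(inp):
    """Behavioural model of an inverter at the gate level."""
    return lambda t: not inp(t)

# Transistor-level models need more than two values: 'X' (undefined)
# and 'Z' (floating) are common additions, giving a four-valued signal.
class V(Enum):
    LO = 0
    HI = 1
    X = 2   # undefined / unknown
    Z = 3   # floating (high impedance)

def nmos(gate, drain):
    """Rough switch-level sketch: an n-type transistor passes its drain
    value when the gate is HI, floats when the gate is LO, and yields
    an undefined output when the gate value is itself undefined."""
    if gate is V.HI:
        return drain
    if gate is V.LO:
        return V.Z
    return V.X

clock = lambda t: t % 2 == 1   # a simple gate-level signal
out = inverter(clock)          # out(t) is the negation of clock(t)
```

The point of the sketch is only that the type of signals is not fixed once and for all: the boolean model cannot represent the Z and X values the switch-level model needs, so each level of abstraction calls for its own special-purpose types.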
This chapter provides an overview of the formulation of higher order logic used in this book for reasoning about hardware. A brief account is also given of the mechanization of this logic in the HOL theorem proving system.
The version of higher order logic described in this chapter was developed by Mike Gordon at the University of Cambridge [41]. Gordon's version of higher order logic is based on Church's formulation of simple type theory [24], which combines features of the λ-calculus with a simplification of the original type theory of Whitehead and Russell [115]. Gordon's machine-oriented formulation extends Church's theory in two significant ways: the syntax of types includes the polymorphic type discipline developed by Milner for the LCF logic PPλ [48], and the primitive basis of the logic includes rules of definition for extending the logic with new constants and types.
The description of higher order logic given in this chapter is not complete, though it does cover all the aspects of the logic important to an understanding of later chapters. This book is concerned more with specifications than with proofs, and verification of theorems will be left mainly to the reader's logical and mathematical intuition. This chapter therefore deals mostly with notation. See Gordon's paper [41] or the HOL system manual [47] for a full account of higher order logic, including a list of the primitive rules of inference and a set-theoretic semantics.
Continued advances in microelectronics have allowed hardware designers to build devices of remarkable size and complexity. With increasing size and complexity, however, it becomes increasingly difficult to ensure that these devices are free of design errors. Exhaustive simulation of even moderately sized circuits is impossible, and partial simulation offers only partial assurance of correctness.
This is an especially serious problem in safety-critical applications, where failure due to design errors may cause loss of life or extensive damage. In these applications, functional errors in circuit designs cannot be tolerated. But even where safety is not the primary consideration, there may be important economic reasons for doing everything possible to eliminate design errors, and to eliminate them early in the design process. A flawed design may mean costly and time-consuming refabrication, and mass-produced devices may have to be recalled and replaced.
A solution to these problems is one of the goals of formal methods for verification of the correctness of hardware designs—sometimes just called hardware verification. With this approach, the behaviour of hardware devices is described mathematically, and formal proof is used to verify that they meet rigorous specifications of intended behaviour. The proofs can be very large and complex, so mechanized theorem-proving tools are often used to construct them.
Considerable progress has been made in this area in the past few years.
The notion of abstraction plays a central role in making formal proof an effective method for dealing with the problem of hardware correctness. This chapter explains how two important types of abstraction—which will be referred to as abstraction within a model of hardware behaviour and abstraction between models of hardware behaviour—can be expressed in higher order logic.
Abstraction within a model has to do with the way in which the correctness of individual designs is formulated. With the approach to hardware verification introduced in the previous chapter, correctness is stated by a proposition which asserts that some relationship of ‘satisfaction’ holds between the model of a circuit design and a specification of its intended behaviour. This relationship must, in general, be one of abstraction—it must relate a detailed model of an actual design to a more abstract specification of required behaviour. Sections 4.1–4.6 show how this notion of correctness as an abstraction relationship can be formalized in logic and incorporated into the method of hardware verification already introduced.
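The idea of correctness as an abstraction relationship can be illustrated in miniature. In the Python sketch below (illustrative names, not the book's notation), a detailed bit-level implementation, a ripple-carry adder over bit vectors, is related to a more abstract word-level specification, addition modulo 2^n, by a data-abstraction function mapping bit vectors to numbers. Here the relationship is checked by exhaustive enumeration on a small width; in the book it is established by formal proof.

```python
from itertools import product

def ripple_add(a, b):
    """Bit-level model: add two equal-width LSB-first bit vectors,
    returning a result of the same width (carry-out discarded)."""
    out, carry = [], False
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        carry = (x and y) or (carry and (x or y))
    return out

def val(bits):
    """Data abstraction: an LSB-first bit vector denotes a number."""
    return sum(1 << i for i, b in enumerate(bits) if b)

def spec(m, n, width):
    """Word-level specification: addition modulo 2**width."""
    return (m + n) % (1 << width)

# Correctness as an abstraction relationship: for every pair of inputs,
# the abstracted output of the detailed model satisfies the abstract
# specification of required behaviour.
WIDTH = 4
assert all(
    val(ripple_add(list(a), list(b))) == spec(val(a), val(b), WIDTH)
    for a, b in product(product([False, True], repeat=WIDTH), repeat=2)
)
```

The abstraction function `val` does the work of relating the two levels of description: the satisfaction relation is not equality between model and specification, but an implication mediated by `val`.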
The second type of abstraction, called abstraction between models, is discussed in section 4.7. Here the concern is not with the correctness of individual designs, but with the relationship between two different collections of specifications for the primitive components used in all designs. One such collection can be an abstraction of another in the sense that it presents a more abstract view of the same primitive components.
This chapter describes the basic techniques for using higher order logic to specify hardware behaviour and to prove the correctness of hardware designs.
The advantages of higher order logic as a formalism for hardware verification are discussed by Gordon in [45] and by Hanna and Daeche in [57, 58]. Higher order logic makes available the results of general mathematics, and this allows one to construct any mathematical tools needed for the verification task in hand. Its expressive power permits hardware behaviour to be described directly in logic; a specialized hardware description language is not needed. In the formulation used here, new constants and types can be introduced by purely definitional means. This allows special-purpose notation for hardware verification to be introduced without the danger associated with postulating ad hoc axioms. In addition, the inference rules of the logic provide a secure basis for proofs of correctness; a specialized deductive calculus for reasoning about hardware behaviour is not required.
Although higher order logic has all these pragmatic advantages, to say that it is the only feasible formalism for hardware verification would be an exaggeration. Some other approaches are briefly discussed at the end of this chapter. Furthermore, higher order logic does not make traditional hardware description languages (HDLs) obsolete. A major problem with these languages is that they usually lack a formal semantics, which precludes using them to reason about hardware behaviour.
The lexicon has come to occupy an increasingly central place in a variety of current linguistic theories, and it is equally important to work in natural language processing. The lexicon – the repository of information about words – has often proved to be a bottleneck in the design of large-scale natural language systems, given the tremendous number of words in the English language, coupled with the constant coinage of new words and shifts in the meanings of existing words. For this reason, there has been growing interest recently in building large-scale lexical knowledge bases automatically, or at least semi-automatically, taking various on-line resources such as machine-readable dictionaries (MRDs) and text corpora as a starting point (see, for instance, the papers in Boguraev and Briscoe, 1989, and Zernik, 1989a). This chapter looks at the task of creating a lexicon from a different perspective, reviewing some of the advances in the understanding of the organization of the lexicon that have emerged from recent work in linguistics and sketching how the results of this work may be used in the design and creation of large-scale lexical knowledge bases that can serve a variety of needs, including those of natural language front ends, machine translation, speech recognition and synthesis, and lexicographers' and translators' workstations.
Although in principle on-line resources such as MRDs and text corpora would seem to provide a wealth of valuable linguistic information that could serve as a foundation for developing a lexical knowledge base, in practice it is often difficult to take full advantage of the information these existing resources contain.
This chapter presents an operational definition of computational lexicography, which is emerging as a discipline in its own right. In the context of one of its primary goals – facilitation of (semi-)automatic construction of lexical knowledge bases (aka computational lexicons) by extracting lexical data from on-line dictionaries – the concerns of dictionary analysis are related to those of lexical semantics. The chapter argues for a particular paradigm of lexicon construction, which relies crucially on having flexible access to fine-grained structural analyses of multiple dictionary sources. To this end, several related issues in computational lexicography are discussed in some detail.
In particular, the notion of structured dictionary representation is exemplified by looking at the wide range of functions encoded, both explicitly and implicitly, in the notations for dictionary entries. This allows the formulation of a framework for exploiting the lexical content of dictionary structure, in part encoded configurationally, for the purpose of streamlining the process of lexical acquisition.
A methodology for populating a lexical knowledge base with knowledge derived from existing lexical resources should not be in isolation from a theory of lexical semantics. Rather than promote any particular theory, however, we argue that without a theoretical framework the traditional methods of computational lexicography can hardly go further than highlighting the inadequacies of current dictionaries. We further argue that by reference to a theory that assumes a formal and rich model of the lexicon, dictionaries can be made to reveal – through guided analysis of highly structured isomorphs – a number of lexical semantic relations of relevance to natural language processing, which are only encoded implicitly and are distributed across the entire source.
One of the major resources in the task of building a large-scale lexicon for a natural-language system is the machine-readable dictionary. Serious flaws (for the computer as user) have already been documented in dictionaries being used as machine-readable dictionaries in natural language processing, including a lack of systematicity in the lexicographers' treatment of linguistic facts; recurrent omission of explicit statements of essential facts; and variations in lexicographical decisions which, together with ambiguities within entries, militate against successful mapping of one dictionary onto another and hence against optimal extraction of linguistic facts.
Large-scale electronic corpora now allow us to evaluate a dictionary entry realistically by comparing it with evidence of how the word is used in the real world. For various lexical items, an attempt is made to compare the view of word meaning that a corpus offers with the way in which this is presented in the definitions of five dictionaries at present available in machine-readable form and being used in natural language processing (NLP) research; corpus evidence is shown to support apparently incompatible semantic descriptions. Suggestions are offered for the construction of a lexical database entry to facilitate the mapping of such apparently incompatible dictionary entries and the consequent maximization of useful facts extracted from these.
How ‘reliable’ are dictionary definitions?
Writing a dictionary is a salutary and humbling experience. It makes you very aware of the extent of your ignorance in almost every field of human experience.
The structural units of phrasal intonation are frequently orthogonal to the syntactic constituent boundaries that are recognized by traditional grammar and embodied in most current theories of syntax. As a result, much recent work on the relation of intonation to discourse context and information structure has either eschewed syntax entirely (cf. Bolinger, 1972; Cutler and Isard, 1980; Gussenhoven, 1983; Brown and Yule, 1983), or has supplemented traditional syntax with entirely nonsyntactic string-related principles (cf. Cooper and Paccia-Cooper, 1980). Recently, Selkirk (1984) and others have postulated an autonomous level of “intonational structure” for spoken language, distinct from syntactic structure. Structures at this level are plausibly claimed to be related to discourse-related notions, such as “focus”. However, the involvement of two apparently uncoupled levels of structure in Natural Language grammar appears to complicate the path from speech to interpretation unreasonably, and thereby to threaten the feasibility of computational speech recognition and speech synthesis.
In Steedman (1991a), I argue that the notion of intonational structure formalized by Pierrehumbert, Selkirk, and others, can be subsumed under a rather different notion of syntactic surface structure, which emerges from the “Combinatory Categorial” theory of grammar (Steedman, 1987, 1990). This theory engenders surface structure constituents corresponding directly to phonological phrase structure. Moreover, the grammar assigns to these constituents interpretations that directly correspond to what is here called “information structure” – that is, the aspects of discourse-meaning that have variously been termed “topic” and “comment”, “theme” and “rheme”, “given” and “new” information, and/or “presupposition” and “focus”.
Throughout most of the 1960s, 1970s, and 1980s, computational linguistics enjoyed an excellent reputation. A sense of the promise of the work to society was prevalent, and high expectations were justified by solid, steady progress in research.
Nevertheless, by the close of the 1980s, many people openly expressed doubt about progress in the field. Are the problems of human language too hard to be solved even in the next eighty years? What measure(s) (other than the number of papers published) show significant progress in the last ten years? Where is the technology successfully deployed in the military, with revolutionary impact? In what directions should the field move, to ensure progress and avoid stagnation?
The symposium
A symposium on Future Directions in Natural Language Processing was held at Bolt Beranek and Newman, Inc. (BBN), in Cambridge, Massachusetts, from November 29, 1989, to December 1, 1989.
The symposium, sponsored by BBN's Science Development Program, brought together top researchers in a variety of areas to discuss the most significant problems and challenges that will face the field of computational linguistics in the next two to ten years. Speakers were encouraged to present both recently completed and speculative work, and to focus on topics that will have the most impact on the field in the coming decade. They were asked to reconsider unsolved problems of long standing as well as to present new opportunities. The purpose was to contribute to long-range planning by funding agencies, research groups, academic institutions, graduate students, and others interested in computational linguistics.
The thirty-six symposium attendees, who are listed following this preface, were all invited participants.