Introduction
In setting up a lexical component for natural language processing systems, one finds that a considerable amount of information is often repeated across sets of word entries. To make the task of grammar writing more efficient, shared information can be expressed in the form of partially specified templates and distributed to relevant entries by inheritance. Shared information across sets of partially specified templates can itself be factored out and conveyed using the same technique. This makes it possible to avoid redefining the same information structures, thus greatly reducing redundancy in the specification of word forms. For example, general properties of intransitive verbs concerning subcategorization and argument structure can be stated once and then inherited by lexical entries which provide word-specific information, e.g. orthography, predicate sense, aktionsart, selectional restrictions. Likewise, properties which are common to all verbs (e.g. part of speech, presence of a subject) or to subsets of the verb class (presence of a direct object for transitive and ditransitive verbs) can be defined as templates which subsume all members of the verb class or some subset of it. This approach to word specification provides a highly structured organization of the lexicon, according to which the properties of related word types, as well as the relation between word types and specific word forms, are expressed in terms of structure sharing and inheritance (Flickinger, Pollard and Wasow, 1985; Flickinger, 1987; Pollard and Sag, 1987, pp. 191–209).
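The economy this buys can be sketched in a few lines of Python (purely for illustration; the template names, features and values are invented and are not the LKB's actual notation): shared properties are stated once in templates, and each entry adds only word-specific information.

```python
# A minimal sketch of template-based lexical specification, assuming a
# plain attribute-value encoding; all names here are invented for
# illustration and are not the LKB's actual notation.

VERB = {"cat": "verb", "subj": "np"}          # shared by all verbs
INTRANS_VERB = {**VERB, "subcat": []}         # no complements
TRANS_VERB = {**VERB, "subcat": ["np"]}       # adds a direct object

def lexical_entry(template, **specific):
    """Layer word-specific information over an inherited template."""
    return {**template, **specific}

sleep = lexical_entry(INTRANS_VERB, orth="sleep", pred="sleep_rel")
devour = lexical_entry(TRANS_VERB, orth="devour", pred="devour_rel")

print(sleep)
# {'cat': 'verb', 'subj': 'np', 'subcat': [], 'orth': 'sleep', 'pred': 'sleep_rel'}
```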
This chapter and those following describe the LKB, a lexical knowledge base system which has been designed as part of the ACQUILEX project to allow the representation, on a large scale, of syntactic and semantic information semi-automatically extracted from machine-readable dictionaries (MRDs). An overview of the ACQUILEX project is given by Briscoe (1991).
Although there has been previous work on building lexicons for Natural Language Processing (NLP) systems from MRDs (e.g. Carroll and Grover, 1989), most attempts at extracting semantic information have not made use of a formally defined representation language; typically a semantic network or a frame representation has been suggested, but the interpretation and functionality of the links has been left vague. Several networks based on taxonomies extracted from MRDs have been built (following Amsler, 1980) and these are useful for tasks such as sense-disambiguation, but are not directly utilisable as NLP lexicons. For a lexicon to be genuinely (re)usable, a declarative, formally specified, representation language is essential. A large lexicon has to be highly structured; it is necessary to be able to group lexical entries and to represent relationships between them, both in order to capture linguistic generalisations and to achieve consistency and conciseness. But, unless these notions of structure are properly specified, a lexicon based on them is in danger of being incomprehensible except (perhaps) to its creators.
In this chapter we discuss how the typed feature structure formalism described in the previous chapters is augmented with a default inheritance system. We first introduce our use of defaults informally and illustrate the sort of taxonomic data that motivated the design of our system. We then discuss some of the formal issues involved in introducing defaults into the representation language.
Taxonomies, Lexical Semantics and Default Inheritance
Our approach to default inheritance in the LKB has been largely motivated by consideration of the taxonomies which may be extracted automatically from MRDs, although the default inheritance mechanism can be used for other purposes, as discussed by Sanfilippo (this volume). In this section we introduce the concept of taxonomy, which is discussed in more detail by Vossen and Copestake (this volume). The notion of taxonomy that has been used in work on MRDs, such as that by Amsler (1981), Chodorow et al. (1985) and Guthrie et al. (1990), is essentially an informal and intuitive one: a taxonomy is the network which results from connecting headwords with the genus terms in their definitions. The concept of genus term is not formally defined; however, for noun definitions, which are all we will consider here, it is generally taken to be the syntactic head of the defining noun phrase (exceptions to this are discussed by Vossen and Copestake).
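A toy sketch of the idea follows, assuming definitions are available as plain strings; the genus-extraction heuristic below is deliberately crude, a stand-in for the proper parsing of the defining noun phrase that real work requires (Vossen and Copestake, this volume).

```python
# Toy construction of a taxonomy network by linking each headword to the
# genus term of its definition. The definitions and the heuristic are
# illustrative only.

definitions = {
    "spaniel": "a dog with long drooping ears",
    "dog": "a domesticated carnivorous animal",
    "animal": "a living organism that can move",
}

def genus(definition):
    words = definition.split()
    for i, word in enumerate(words):
        if word in ("with", "that", "which"):   # crude end of the defining NP
            return words[i - 1]
    return words[-1]                            # otherwise, the final head noun

taxonomy = {head: genus(defn) for head, defn in definitions.items()}
print(taxonomy)
# {'spaniel': 'dog', 'dog': 'animal', 'animal': 'organism'}
```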
We characterise a notion of prioritised multiple inheritance (PMI) and contrast it with the more familiar orthogonal multiple inheritance (OMI). DATR is a knowledge representation language that was designed to facilitate OMI analyses of natural language lexicons: it contains no special purpose facility for PMI and this has led some researchers to conclude that PMI analyses are beyond the expressive capacity of DATR. Here, we present three different techniques for implementing PMI entirely within DATR's existing syntactic and semantic resources. In presenting them, we draw attention to their respective advantages and disadvantages.
Introduction
‘Multiple inheritance’, in inheritance network terminology, describes any situation where a node in an inheritance network inherits information from more than one other node in the network. Wherever this phenomenon occurs there is the potential for conflicting inheritance, i.e. when the information inherited from one node is inconsistent with that inherited from another. Because of this, the handling of multiple inheritance is an issue which is central to the design of any formalism for representing inheritance networks. For the formalism to be sound, it must provide a way of avoiding or resolving any conflict which might arise. This might be by banning multiple inheritance altogether, restricting it so that conflicts are avoided, providing some mechanism for conflict resolution as part of the formalism itself, or providing the user of the formalism with the means to specify how the conflict should be resolved.
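One of the strategies just listed, resolving conflicts by a fixed priority among parents, can be illustrated concretely with Python's own multiple inheritance, where the method resolution order (MRO) linearizes the network; this is an analogy, not DATR, and the lexical class names are invented.

```python
# Conflicting multiple inheritance resolved by priority: Python's method
# resolution order (MRO) gives earlier-listed parents precedence.

class Verb:
    takes_subject = True
    inflects = True

class Auxiliary(Verb):
    inflects = False                 # conflicts with Verb.inflects

class Modal(Auxiliary, Verb):        # inherits from both parents
    pass

print(Modal.inflects)                # False: Auxiliary outranks Verb
print([c.__name__ for c in Modal.__mro__])
# ['Modal', 'Auxiliary', 'Verb', 'object']
```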
This chapter describes – from a mathematical perspective – the system of typed feature structures used in the ACQUILEX Lexical Knowledge Base (LKB). We concentrate on describing the type system the LKB takes as input, making explicit the necessary conditions on the type hierarchy and explaining how – mathematically – our system of constraints works. It is assumed that the reader is familiar with basic unification-based formalisms like PATR-II, as explained in Shieber (1986). It must also be said from the start that our approach draws heavily on the work on typed feature structures by Carpenter (1990, 1992).
The LKB works essentially through unification of (typed) feature structures. Since most of the time we deal with typed feature structures (defined in section 10.2), we will normally drop the qualifier and talk simply of feature structures. When a distinction is necessary, we refer to structures in PATR-II and similar systems as untyped feature structures. Feature structures are defined over a (fixed) finite set of features FEAT and over a (fixed) type hierarchy 〈TYPE, ⊑〉. Given FEAT and 〈TYPE, ⊑〉 we can define F, the collection of all feature structures over FEAT and 〈TYPE, ⊑〉. But we are interested in feature structures which are well-formed with respect to a set of constraints. To describe constraints and well-formedness of feature structures we specify a function C: TYPE → F, which associates a constraint feature structure C(t) with each type t in the type hierarchy TYPE.
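As a rough illustration of these definitions, the sketch below unifies typed feature structures represented as nested dictionaries over a toy tree-shaped hierarchy; it ignores reentrancy and the well-formedness constraints C(t), and all type and feature names are invented.

```python
# Simplified unification of typed feature structures: the two types must
# have a greatest lower bound (glb) in the hierarchy, and features are
# unified recursively. Reentrancy and constraint checking are omitted.

TYPE = "__type__"
parents = {"verb": "sign", "noun": "sign", "trans-verb": "verb"}  # toy hierarchy

def ancestors(t):
    out = {t}
    while t in parents:
        t = parents[t]
        out.add(t)
    return out

def glb(t1, t2):
    # In a tree-shaped hierarchy the glb, if any, is the lower of the two types.
    if t2 in ancestors(t1):
        return t1
    if t1 in ancestors(t2):
        return t2
    return None                       # incompatible types: unification fails

def unify(f1, f2):
    if isinstance(f1, str) or isinstance(f2, str):    # atomic values
        return f1 if f1 == f2 else None
    t = glb(f1[TYPE], f2[TYPE])
    if t is None:
        return None
    result = {TYPE: t}
    for feat in (f1.keys() | f2.keys()) - {TYPE}:
        if feat in f1 and feat in f2:
            value = unify(f1[feat], f2[feat])
            if value is None:
                return None                           # feature clash
            result[feat] = value
        else:
            result[feat] = f1.get(feat, f2.get(feat))
    return result

fs1 = {TYPE: "verb", "AGR": "3sg"}
fs2 = {TYPE: "trans-verb", "OBJ": {TYPE: "noun"}}
print(unify(fs1, fs2))
# {'__type__': 'trans-verb', 'AGR': '3sg', 'OBJ': {'__type__': 'noun'}}
```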
A stand-alone version of the LKB software system, with demonstration type systems, lexicons and so on is available for distribution; contact the authors for further details.
The Sussex DATR Implementation
The Sussex implementation of DATR comprises a compiler written in Prolog, and a wide-ranging collection of DATR example files. The compiler takes a DATR theory and produces Prolog code for query evaluation relative to that theory. The code is readily customisable, and customisations for Poplog Prolog, C-Prolog, Arity Prolog, Prolog2, Quintus Prolog and SICStus Prolog are provided. Source listings, documentation and many of the example files may also be found in “The DATR Papers, Volume 1” (Cognitive Science Research Report 139).
This implementation is provided ‘as is’, with no warranty or support, and for research purposes only. Copyright remains with the University of Sussex.
The Prolog source code and the example files for the DATR system are available on a 720K 3.5 inch MS-DOS disk for £12.00 (within the UK) or US $25 (outside the UK) from: Technical Reports, School of Cognitive and Computing Sciences, University of Sussex, Brighton BN1 9QH, UK. “The DATR Papers, Volume 1” (Cognitive Science Research Report 139) is also available for £6 (US $12) from the same address.
The Traffic Information Collator (TIC) (Allport, 1988a,b) is a prototype system which takes verbatim police reports of traffic incidents, interprets them, builds a picture of what is happening on the roads and broadcasts appropriate messages automatically to motorists where necessary. Cahill and Evans (1990) described the process of converting the main TIC lexicon (a lexicon of around 1000 words specific to the domain of traffic reports) into DATR (Evans and Gazdar, 1989a,b; 1990). This chapter reviews the strategy adopted in the conversion discussed in that paper, and discusses the results of converting the whole lexicon, together with statistics comparing efficiency and performance between the original lexicon and the DATR version.
Introduction
The Traffic Information Collator (TIC) is a prototype system which takes verbatim police reports of traffic incidents, interprets them, builds a picture of what is happening on the roads and broadcasts appropriate messages automatically to motorists where necessary. In Cahill and Evans (1990), the basic strategy of defining the structure of lexical entries was described. That paper concentrated on the main TIC lexicon, which was just one part of a collection of different kinds of lexical information and only dealt with a small fragment even of that. The whole conversion involved collating all of that information into a single DATR description.
In this chapter, we address some issues in the design of declarative languages based on the notion of inheritance. First, we outline the connections and similarities between the notions of object, frame, conceptual graph and feature structure, and we present a synthetic view of these notions. We then present the Typed Feature Structure (TFS) language developed at the University of Stuttgart, which reconciles the object-oriented approach with logic programming. Finally, we discuss some language design issues.
Convergences
Developing large NLP software is a very complex and time-consuming task. The complexity of NLP can be characterized by the following two main factors:
NLP is data-intensive. Any NLP application needs large amounts of complex linguistic information. For example, a realistic application typically has dictionaries with tens of thousands of lexical entries.
Sophisticated NLP applications such as database interfaces or machine translation build very complex and intricate data structures for representing linguistic objects associated with strings of words. Part of the complexity also lies in the processing of such objects.
Object-oriented Approaches
An object-oriented approach to linguistic description addresses these two sources of complexity by providing:
facilities to manage the design process: data abstraction and inheritance.
facilities for capturing directly the interconnections and constraints in the data: properties, relations and complex objects. These features are common to object-oriented languages (OOL), object-oriented database management systems (OODBMS) and knowledge representation languages (KRL).
Theories of nonmonotonic reasoning are, on the face of it, of at least two sorts. In Circumscription, generic facts like Birds fly are taken to be essentially normative, and nonmonotonicity arises when individuals are assumed to be as normal as is consistent with available information about them. In theories like Default Logic such facts are taken to be rules of inference, and nonmonotonicity arises when available information is augmented by adding as many as possible of the inferences sanctioned by such rules. Depending on which of the two informal views is taken, different patterns of nonmonotonic reasoning are appropriate. Here it is shown that these different patterns of reasoning cannot be combined in a single theory of nonmonotonic reasoning.
Introduction
Nonmonotonic reasoning is reasoning that lacks the monotonicity property traditionally taken to be characteristic of logical reasoning. In a theory of nonmonotonic reasoning, the consequences of a set of premises do not always accumulate as the set of premises is expanded. Nonmonotonicity has been researched in artificial intelligence because people reason nonmonotonically, and do so in ways which seem directly related to intelligence. One important example is commonsense reasoning about kinds, where jumping to conclusions enables intelligent agents to save time otherwise spent on gathering information.
In just what way is such jumping to logically invalid conclusions reasonable?
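The pattern at issue can be made concrete with a toy example in the spirit of Default Logic (the encoding below is ours, not any published formalism): enlarging the premise set removes a previously drawn conclusion.

```python
# Toy nonmonotonic consequence: the default "birds fly" is applied only
# when its conclusion does not conflict with what is already known.

def consequences(premises):
    facts = set(premises)
    if "penguin" in facts:                  # strict rules about penguins
        facts |= {"bird", "does-not-fly"}
    if "bird" in facts and "does-not-fly" not in facts:
        facts.add("flies")                  # default: birds fly
    return facts

print(consequences({"bird"}))               # {'bird', 'flies'}
print(consequences({"bird", "penguin"}))    # more premises, yet 'flies'
                                            # is no longer concluded
```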
We present a definition of skeptical and credulous variants of default unification, the purpose of which is to add default information from one feature structure to the strict information given in another. Under the credulous definition, the default feature structure contributes as much information to the result as is consistent with the information in the strict feature structure. Credulous default unification turns out to be non-deterministic due to the fact that there may be distinct maximal subsets of the default information which may be consistently combined with the strict information. Skeptical default unification is obtained by restricting the default information to that which is contained in every credulous result. Both definitions are fully abstract in that they depend only on the information ordering of feature structures being combined and not on their internal structure, thus allowing them to be applied to just about any notion of feature structure and information ordering. We then consider the utility of default unification for constructing templates with default information and for defining how information is inherited in an inheritance-based grammar. In particular, we see how templates in the style of PATR-II can be defined, but conclude that such mechanisms are overly sensitive to order of presentation. Unfortunately, we only obtain limited success in applying default unification to simple systems of default inheritance. We follow the Common Lisp Object System-based approach of Russell et al.
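The two operations can be sketched over flat attribute-value lists (our own simplified encoding, not the chapter's; in the chapter the nondeterminism arises from reentrant feature structures, which we simulate here with mutually inconsistent default pairs).

```python
# Credulous default unification: add each maximal consistent subset of the
# defaults to the strict information. Skeptical: keep only what every
# credulous result shares.
from itertools import combinations

def consistent(strict, pairs):
    merged = dict(strict)
    for attr, value in pairs:
        if merged.setdefault(attr, value) != value:
            return False                    # two distinct values for one attribute
    return True

def credulous_unify(strict, default_pairs):
    maximal = []
    for n in range(len(default_pairs), -1, -1):        # largest subsets first
        for subset in map(frozenset, combinations(default_pairs, n)):
            if consistent(strict, subset) and not any(subset < m for m in maximal):
                maximal.append(subset)
    return [{**dict(m), **strict} for m in maximal]

def skeptical_unify(strict, default_pairs):
    results = credulous_unify(strict, default_pairs)
    common = set.intersection(*(set(r.items()) for r in results))
    return dict(common)

strict = {"A": 1}
defaults = [("A", 2), ("B", 1), ("B", 2)]   # ('A', 2) clashes with the strict
                                            # info; the two 'B' pairs clash
                                            # with each other
print(credulous_unify(strict, defaults))    # [{'B': 1, 'A': 1}, {'B': 2, 'A': 1}]
print(skeptical_unify(strict, defaults))    # {'A': 1}: the contested 'B' is dropped
```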
In recent years, the lexicon has become the focus of considerable research in (computational) linguistic theory and natural language processing (NLP) research; the reasons for this trend are both theoretical and practical. Within linguistics, the role of the lexicon has become increasingly central as more and more linguistic generalisations have been seen to have a lexical dimension, whilst for NLP systems, the lexicon has increasingly become the chief ‘bottleneck’ in the production of habitable applications offering an adequate vocabulary for the intended task. This edited collection of essays derives from a workshop held in Cambridge in April 1991 to bring together researchers from both Europe and America and from both fields working on formal and computational accounts of the lexicon. The workshop was funded under the European Strategic Programme in Information Technology (ESPRIT) Basic Research Action (BRA) through the ACQUILEX project (‘Acquisition of Lexical Information for Natural Language Processing’) and was hosted by the Computer Laboratory, Cambridge University.
The ACQUILEX project is concerned with the exploitation of machine-readable versions of conventional dictionaries in an attempt to develop substantial lexicons for NLP in a resource efficient fashion. However, the focus of the workshop was on the representation and organisation of information in the lexicon, regardless of the mode of acquisition.
Auxiliaries (words such as must, shall, is) are central to English grammar. They are also puzzlingly complex in their behaviour. So much so that disagreement about their nature has been radical, and they remain a major area of difficulty. The interest of this area is compounded by the possibility of working out its history in some detail, given the abundant data available for earlier periods of English. It is clear that the area was much less well defined in earlier times; indeed it has even been claimed that modals were not to be distinguished grammatically from straightforward verbs at the earliest periods for which we have substantial records. Thus change has apparently been considerable. This means that the process of change will itself be important and interesting, and that we may achieve some insights into the nature of the modern category from an understanding of its development. So there is a major twofold challenge: to provide an appropriate synchronic account of English auxiliaries, and to show how this area of grammar developed historically. These are the twin challenges taken up in this book.
First, then, I have established a new account of the working of the modern auxiliary system. Its distinctive claim is that although auxiliaries carry apparently verbal categories such as ‘finite’ or ‘infinitive’, these are not inflectional in auxiliaries but are lexically specified, so that forms such as should or been behave in some respects like independent items. This leads to a simple, new and nonabstract account of a range of the properties of auxiliaries in terms of their distinctness from verbs, and to a fundamentally ‘lexical’ account of major properties of this ‘grammatical’ area.
This book aims to give an account of the grammar and history of English auxiliaries, that is of words like those italicized here:
(1) Could John have written it if Mary didn't? – No, it wasn't written by a man.
Since this group includes words associated with modality, aspect, tense and voice (as in could, have, didn't, wasn't), they have often been labelled ‘auxiliary’ or ‘helping’ verbs, where an auxiliary is ‘a verb used to form the tenses, moods, voices, etc. of other verbs’ (OED Auxiliary, a. and sb. B sb. 3). This terminology encodes the traditional view that such properties are fundamentally those of verbs, as they are (for example) in the Latin one-word forms cantabo, cantarem, cantabatur in contrast with the corresponding English (I) shall sing, (I) might sing, (it) was being sung.
The problems of the present-day analysis and the historical development of this group of words have been a major area for discussion and disagreement in recent years. In this book I will present and justify new analyses in both structure and history. In the first half of the book I will argue that the most appropriate characterization of some of the major idiosyncrasies of the English auxiliary system follows directly from the nature of the categorial relationship between auxiliary and full verb. Auxiliaries do not share morphosyntactic generalizations appropriate to full verbs. Instead we need a fundamentally lexical account of the interrelationships between their categories. This insight leads to a fresh and illuminating account of ordering restrictions on English auxiliaries, of restrictions on the availability of their morphosyntactic categories, of their distribution in ellipsis and of some other individual properties.