Search results for Artificial Intelligence and Natural Language Processing

A decision-theoretic framework for the evaluation of language models used in speech recognizers
J. R. DELLER, K. H. DESAI, Y. P. YANG
Journal:

Natural Language Engineering / Volume 11 / Issue 4 / December 2005

Published online by Cambridge University Press:

10 November 2005, pp. 363-396
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
An analytical method for design and performance analysis of language models (LM) is described, and an example interactive software tool based on the technique is demonstrated. The LM performance analysis does not require on-line simulation or experimentation with the recognition system in which the LM is to employed. By exploiting parallels with signal detection theory, a profile of the LM as a function of the design parameters is given in a set of curves analogous to a receiver-operating-characteristic display.

Comparing example-based and statistical machine translation
ANDY WAY, NANO GOUGH
Journal:

Natural Language Engineering / Volume 11 / Issue 3 / September 2005

Published online by Cambridge University Press:

21 September 2005, pp. 295-309
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
In previous work (Gough and Way 2004), we showed that our Example-Based Machine Translation (EBMT) system improved with respect to both coverage and quality when seeded with increasing amounts of training data, so that it significantly outperformed the on-line MT system Logomedia according to a wide variety of automatic evaluation metrics. While it is perhaps unsurprising that system performance is correlated with the amount of training data, we address in this paper the question of whether a large-scale, robust EBMT system such as ours can outperform a Statistical Machine Translation (SMT) system. We obtained a large English-French translation memory from Sun Microsystems from which we randomly extracted a near 4K test set. The remaining data was split into three training sets, of roughly 50K, 100K and 200K sentence-pairs in order to measure the effect of increasing the size of the training data on the performance of the two systems. Our main observation is that contrary to perceived wisdom in the field, there appears to be little substance to the claim that SMT systems are guaranteed to outperform EBMT systems when confronted with ‘enough’ training data. Our tests on a 4.8 million word bitext indicate that while SMT appears to outperform our system for French-English on a number of metrics, for English-French, on all but one automatic evaluation metric, the performance of our EBMT system is superior to the baseline SMT model.

Automatic bilingual lexicon acquisition using random indexing of parallel corpora
M. SAHLGREN, J. KARLGREN
Journal:

Natural Language Engineering / Volume 11 / Issue 3 / September 2005

Published online by Cambridge University Press:

21 September 2005, pp. 327-341
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
This paper presents a very simple and effective approach to using parallel corpora for automatic bilingual lexicon acquisition. The approach, which uses the Random Indexing vector space methodology, is based on finding correlations between terms based on their distributional characteristics. The approach requires a minimum of preprocessing and linguistic knowledge, and is efficient, fast and scalable. In this paper, we explain how our approach differs from traditional cooccurrence-based word alignment algorithms, and we demonstrate how to extract bilingual lexica using the Random Indexing approach applied to aligned parallel data. The acquired lexica are evaluated by comparing them to manually compiled gold standards, and we report overlap of around 60%. We also discuss methodological problems with evaluating lexical resources of this kind.

Parallel texts
RADA MIHALCEA, MICHEL SIMARD
Journal:

Natural Language Engineering / Volume 11 / Issue 3 / September 2005

Published online by Cambridge University Press:

21 September 2005, pp. 239-246
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Parallel texts have become a vital element for natural language processing. We present a panorama of current research activities related to parallel texts, and offer some thoughts about the future of this rich field of investigation.

Optimization of word alignment clues
JÖRG TIEDEMANN
Journal:

Natural Language Engineering / Volume 11 / Issue 3 / September 2005

Published online by Cambridge University Press:

21 September 2005, pp. 279-293
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Statistical, linguistic, and heuristic clues can be used for the alignment of words and multi-word units in parallel texts. This article describes the clue alignment approach and the optimization of its parameters using a genetic algorithm. Word alignment clues can come from various sources such as statistical alignment models, co-occurrence tests, string similarity scores and static dictionaries. A genetic algorithm implementing an evolutionary procedure can be used to optimize the parameters necessary for combining available clues. Experiments on English/Swedish bitext show a significant improvement of about 6% in F-scores compared to the baseline produced by statistical word alignment.Most of the work described in this paper was carried out at the Department of Linguistics and Philology at Uppsala University. I would like to acknowledge technical and scientific support by people at the department in Uppsala.

Bootstrapping parsers via syntactic projection across parallel texts
REBECCA HWA, PHILIP RESNIK, AMY WEINBERG, CLARA CABEZAS, OKAN KOLAK
Journal:

Natural Language Engineering / Volume 11 / Issue 3 / September 2005

Published online by Cambridge University Press:

21 September 2005, pp. 311-325
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Broad coverage, high quality parsers are available for only a handful of languages. A prerequisite for developing broad coverage parsers for more languages is the annotation of text with the desired linguistic representations (also known as “treebanking”). However, syntactic annotation is a labor intensive and time-consuming process, and it is difficult to find linguistically annotated text in sufficient quantities. In this article, we explore using parallel text to help solving the problem of creating syntactic annotation in more languages. The central idea is to annotate the English side of a parallel corpus, project the analysis to the second language, and then train a stochastic analyzer on the resulting noisy annotations. We discuss our background assumptions, describe an initial study on the “projectability” of syntactic relations, and then present two experiments in which stochastic parsers are developed with minimal human intervention via projection from English.

Exploiting parallel texts in the creation of multilingual semantically annotated resources: the MultiSemCor Corpus
L. BENTIVOGLI, E. PIANTA
Journal:

Natural Language Engineering / Volume 11 / Issue 3 / September 2005

Published online by Cambridge University Press:

21 September 2005, pp. 247-261
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
In this article we illustrate and evaluate an approach to create high quality linguistically annotated resources based on the exploitation of aligned parallel corpora. This approach is based on the assumption that if a text in one language has been annotated and its translation has not, annotations can be transferred from the source text to the target using word alignment as a bridge. The transfer approach has been tested and extensively applied for the creation of the MultiSemCor corpus, an English/Italian parallel corpus created on the basis of the English SemCor corpus. In MultiSemCor the texts are aligned at the word level and word sense annotated with a shared inventory of senses. A number of experiments have been carried out to evaluate the different steps involved in the methodology and the results suggest that the transfer approach is one promising solution to the resource bottleneck. First, it leads to the creation of a parallel corpus, which represents a crucial resource per se. Second, it allows for the exploitation of existing (mostly English) annotated resources to bootstrap the creation of annotated corpora in new (resource-poor) languages with greatly reduced human effort.

Constrained EM for parallel text alignment
DAVID TALBOT
Journal:

Natural Language Engineering / Volume 11 / Issue 3 / September 2005

Published online by Cambridge University Press:

21 September 2005, pp. 263-277
- Article
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Standard parameter estimation schemes for statistical translation models can struggle to find reasonable settings on some parallel corpora. We show how auxiliary information can be used to constrain the procedure directly by restricting the set of alignments explored during parameter estimation. This enables the integration of bilingual and monolingual knowledge sources while retaining the flexibility of the underlying models. We demonstrate the effectiveness of this approach for incorporating linguistic and domain-specific constraints on various parallel corpora, and consider the importance of using the context of the parallel text to guide the application of such constraints.

5 - Application to shallow parsing
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 85-103
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Summary

The goal of this chapter is to show that even complex recursive NLP tasks such as parsing (assigning syntactic structure to sentences using a grammar, a lexicon and a search algorithm) can be redefined as a set of cascaded classification problems with separate classifiers for tagging, chunk boundary detection, chunk labeling, relation finding, etc. In such an approach, input vectors represent a focus item and its surrounding context, and output classes represent either a label of the focus (e.g., part of speech tag, constituent label, type of grammatical relation) or a segmentation label (e.g., start or end of a constituent). In this chapter, we show how a shallow parser can be constructed as a cascade of MBLP-classifiers and introduce software that can be used for the development of memory-based taggers and chunkers.
Although in principle full parsing could be achieved in this modular, classification-based way (see section 5.5), this approach is more suited for shallow parsing. Partial or shallow parsing, as opposed to full parsing, recovers only a limited amount of syntactic information from natural language sentences. Especially in applications such as information retrieval, question answering, and information extraction, where large volumes of, often ungrammatical, text have to be analyzed in an efficient and robust way, shallow parsing is useful. For these applications a complete syntactic analysis may provide too much or too little information.

3 - Memory and Similarity
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 26-56
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Summary

An MBLP system as introduced in the previous chapters has two components: a learning component which is memory-based, and a performance component which is similarity-based. The learning component is memorybased as it involves storing examples in memory (also called the instance base or case base) without abstraction, selection, or restructuring. In the performance component of an MBLP system the stored examples are used as a basis for mapping input to output; input instances are classified by assigning them an output label. During classification, a previously unseen test instance is presented to the system. The class of this instance is determined on the basis of an extrapolation from the most similar example(s) in memory. There are different ways in which this approach can be operationalized. The goal of this chapter is twofold: to provide a clear definition of the operationalizations we have found to work well for NLP tasks, and to provide an introduction to TIMBL, a software package implementing all algorithms and metrics discussed in this book. The emphasis on hands-on use of software in a book such as this deserves some justification. Although our aims are mainly theoretical in showing that MBLP has the right bias for solving NLP tasks on the basis of argumentation and experiment, we believe that the strengths and limitations of any algorithm can only be understood in sufficient depth by experimenting with this specific algorithm.

1 - Memory-Based Learning in Natural Language Processing
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 3-14
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Summary

This book presents a simple and efficient approach to solving natural language processing problems. The approach is based on the combination of two powerful techniques: the efficient storage of solved examples of the problem, and similarity-based reasoning on the basis of these stored examples to solve new ones.
Natural language processing (NLP) is concerned with the knowledge representation and problem solving algorithms involved in learning, producing, and understanding language. Language technology, or language engineering, uses the formalisms and theories developed within NLP in applications ranging from spelling error correction to machine translation and automatic extraction of knowledge from text.
Although the origins of NLP are both logical and statistical, as in other disciplines of artificial intelligence, historically the knowledge-based approach has dominated the field. This has resulted in an emphasis on logical semantics for meaning representation, on the development of grammar formalisms (especially lexicalist unification grammars), and on the design of associated parsing methods and lexical representation and organization methods. Well-known textbooks such as Gazdar and Mellish (1989) and Allen (1995) provide an overview of this ‘rationalist’ or ‘deductive’ approach.
The approach in this book is firmly rooted in the alternative empirical (inductive) approach. From the early 1990s onwards, empirical methods based on statistics derived from corpora have been adopted widely in the field. There were several reasons for this.

Bibliography
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 168-185
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation

Preface
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 1-2
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Summary

This book is a reflection of about twelve years of work on memory-based language processing. It reflects on the central topic from three perspectives. First, it describes the influences from linguistics, artificial intelligence, and psycholinguistics on the foundations of memory-based models of language processing. Second, it highlights applications of memory-based learning to processing tasks in phonology and morphology, and in shallow parsing. Third, it ventures into answering the question why memory-based learning fills a unique role in the larger field of machine learning of natural language – because it is the only algorithm that does not abstract away from its training examples. In addition, we provide tutorial information on the use of TIMBL, a software package for memory-based learning, and an associated suite of software tools for memory-based language processing.
For us, the direct inspiration for starting to experiment with extensions of the k-nearest neighbor classifier to language processing problems was the successful application of the approach by Stanfill and Waltz to grapheme-to-phoneme conversion in the eighties. During the past decade we have been fortunate to have expanded our work with a great team of fellow researchers and students on memory-based language processing in two locations: the ILK (Induction of Linguistic Knowledge) research group at Tilburg University, and CNTS (Center for Dutch Language and Speech) at the University of Antwerp. Our own first implementations of memory-based learning were soon superseded by well-coded software systems by Peter Berck, Jakub Zavrel, Bertjan Busser, and Ko van der Sloot.

Index
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 186-189
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation

4 - Application to morpho-phonology
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 57-84
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Summary

As argued in chapter 1, if a natural language processing task is formulated as either a disambiguation task or a segmentation task, it can be presented as a classification task to a memory-based learner, as well as to any other machine learning algorithm capable of learning from labeled examples. In this chapter as well as in the next we provide examples of how we formulate tasks in an MBLP framework. We start with one disambiguation and one segmentation task operating at the phonological and morphological levels, respectively.
A non-trivial portion of the complexity of natural languages is determined at the phonological and morphological levels, where phonemes and morphemes come together to form words. A language's phoneme inventory is based on many individual observations in which changing one particular speech sound of a spoken word into another changes the meaning of the word. A morpheme is usually identified as a string of phonemes carrying meaning on its own; a special class of morphemes, affixes, does not carry meaning on its own, but instead affixes have the ability to add or change some aspect of meaning when attached to a morpheme or string of morphemes.
One major problem of natural language processing in the phonological and morphological domains is that many existing sequences of phonemes and morphemes have highly ambiguous surface written forms, and especially in alphabetic writing systems where there is ambiguity in the relation between letters and phonemes.

6 - Abstraction and generalization
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 104-147
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Summary

The concepts of abstraction and generalization are tightly coupled to Ockham's razor, a medieval scientific principle, which is still regarded in many branches of modern science as fundamentally true. Sources quote the principle as “non preterio necessitate delendam”, or freely translated in the imperative form, delete all elements in a theory that are not necessary. The goal of its application is to maximize economy and generality: it favors small theories over large ones, when they have the same expressive power. The latter can be read as ‘having the same generalization accuracy’, which, as we have exemplified in the previous chapters, can be estimated through validation tests with held-out material.
A twentieth-century incarnation of Ockham's razor is the minimal description length (MDL) principle (Rissanen, 1983), coined in the context of computational learning theory. It has been used as the leading principle in the design of decision tree induction algorithms such as C4.5 (Quinlan, 1993) and rule induction algorithms such as RIPPER (Cohen, 1995). The goal of these algorithms is to find a compact representation of the classification information in the given learning material that at the same time generalizes well to unseen material. C4.5 uses decision trees; RIPPER uses ordered lists of rules to meet that end.
In contrast, memory-based learning is not minimal – its description length is equal to the amount of memory it takes to store the learning examples. Keeping all learning examples in memory is all but economical.

2 - Inspirations from linguistics and artificial intelligence
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 15-25
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Summary

Memory-Based Language Processing, MBLP, is based on the idea that learning and processing are two sides of the same coin. Learning is the storage of examples in memory, and processing is similarity-based reasoning with these stored examples. Although we have developed a specific operationalization of these ideas, they have been around for a long time. In this chapter we provide an overview of similar ideas in linguistics, psychology, and computer science, and end with a discussion of the crucial lesson learned from this literature, namely, that generalization from experience to new decisions is possible without the creation of abstract representations such as rules.
Inspirations from linguistics
While the rise of Chomskyan linguistics in the 1960s is considered a turning point in the development of linguistic theory, it is mostly before this time that we find explicit and sometimes adamant arguments for the use of memory and analogy that explain both the acquisition and the processing of linguistic knowledge in humans. We compress this into a brief review of thoughts and arguments voiced by the likes of Ferdinand de Saussure, Leonard Bloomfield, John Rupert Firth, Michael Halliday, Zellig Harris, and Royal Skousen, and we point to related ideas in psychology and cognitive linguistics.

Frontmatter
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp i-iv
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation

Contents
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp v-viii
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation

7 - Extensions
Walter Daelemans, Antal van den Bosch
Book:

Memory-Based Language Processing

Published online:

22 September 2009

Print publication:

01 September 2005, pp 148-167
- Chapter
- - Get access
    
    Check if you have access via personal or institutional login
    
    Log in Register
- Export citation
Summary

This chapter describes two complementary extensions to memory-based learning: a search method for optimizing parameter settings, and methods for reducing the near-sightedness of the standard memory-based learner to its own contextual decisions in sequence processing tasks. Both complement the core algorithm as we have been discussing so far. Both methods have a wider applicability than just memory-based learning, and can be combined with any classification-based supervised learning algorithm.
First, in section 7.1 we introduce a search method for finding optimal algorithmic parameter settings. No universal rules of thumb exist for setting parameters such as the k in the k-NN classification rule, or the feature weighting metric, or the distance weighting metric. They also interact in unpredictable ways. Yet, parameter settings do matter; they can seriously change generalization performance on unseen data. We show that applying heuristic search methods in an experimental wrapping environment (in which a training set is further divided into training and validation sets) can produce good parameter settings automatically.
Second, in section 7.2 we describe two technical solutions to the problem of “sequence near-sightedness” from which many machine-learning classifiers and stochastic models suffer that predict class symbols without coordinating one prediction with another in some way. When such a classifier is performing natural language sequence tasks, producing class symbol by class symbol, it is unable to stop itself from generating output sequences that are impossible and invalid, because information on the output sequence being generated is not available to the learner.

Artificial Intelligence and Natural Language Processing

Refine search

Refine search

Actions for selected content:

3312 results in Artificial Intelligence and Natural Language Processing

A decision-theoretic framework for the evaluation of language models used in speech recognizers

Comparing example-based and statistical machine translation

Automatic bilingual lexicon acquisition using random indexing of parallel corpora

Parallel texts

Optimization of word alignment clues

Bootstrapping parsers via syntactic projection across parallel texts

Exploiting parallel texts in the creation of multilingual semantically annotated resources: the MultiSemCor Corpus

Constrained EM for parallel text alignment

5 - Application to shallow parsing

Summary

3 - Memory and Similarity

Summary

1 - Memory-Based Learning in Natural Language Processing

Summary

Bibliography

Preface

Summary

Index

4 - Application to morpho-phonology

Summary

6 - Abstraction and generalization

Summary

2 - Inspirations from linguistics and artificial intelligence

Summary

Frontmatter

Contents

7 - Extensions

Summary

Artificial Intelligence and Natural Language Processing

Refine search

Refine search

Actions for selected content:

Save Search

3312 results in Artificial Intelligence and Natural Language Processing

Summary

Summary

Summary

Summary

Summary

Summary

Summary

Summary