In emergency care settings, there is a crucial need for automated translation tools. We focus here on the BabelDr system, a speech-enabled fixed-phrase translator used to improve communication in emergency settings between doctors and allophone patients. The aim of the chapter is two-fold. First, we assess whether a bidirectional version of the phraselator, which allows patients to answer doctors’ questions by selecting pictures from open-source databases, improves user satisfaction. Second, we evaluate pictograph usability in this context. Our hypotheses are that images help to improve patient satisfaction and that multiple factors influence pictograph usability. Factors of interest include not only the comprehensibility of the pictographs per se, but also how the images are presented to the user with respect to their number and ordering. We show that most respondents prefer to use the interface with pictographs and that multiple factors influence participants’ ability to find a pictograph based on a written form, but that the comprehensibility of the individual pictographs is probably the most important one.
This chapter explains the speech and translation technologies most significant for healthcare professionals. We first examine the progress of automatic speech recognition (ASR) and text-to-speech (TTS). Turning to machine translation (MT), we briefly cover fixed-phrase-based translation systems (“phraselators”), with consideration of their advantages and disadvantages. The major types of full (wide-ranging, relatively unrestricted) MT – symbolic, statistical, and neural – are then explained in some detail. As an optional bonus, we provide an extended explanation of transformer-based neural translation. We postpone discussion of practical healthcare applications of speech and translation technologies to a separate chapter.
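As a concrete anchor for that bonus discussion, the operation at the heart of the transformer is scaled dot-product attention. The formulation below is the standard one from the transformer literature, not necessarily the chapter’s own notation:

```latex
% Scaled dot-product attention: Q (queries), K (keys), V (values);
% d_k is the dimensionality of the key vectors; softmax is applied row-wise.
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```

Each output representation is thus a weighted average of the value vectors, with weights given by the scaled, normalized dot products between queries and keys.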
In this chapter, we review random encoding models that directly reduce the dimensionality of distributional data without first building a co-occurrence matrix. While matrix distributional semantic models (DSMs) output either explicit or implicit distributional vectors, random encoding models only produce low-dimensional embeddings, and emphasize efficiency, scalability, and incrementality in building distributional representations. We discuss the mathematical foundation for models based on random encoding, the Johnson-Lindenstrauss lemma. We introduce Random Projection, before turning to Random Indexing and BEAGLE, a random encoding model that encodes sequential information in distributional vectors. Then, we introduce a variant of Random Indexing that uses random permutations to represent the position of the context lexemes with respect to the target, similarly to BEAGLE. Finally, we discuss Self-Organizing Maps, a kind of unsupervised neural network that shares important similarities with random encoding models.
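To make the incremental character of these models concrete, here is a minimal Python sketch of Random Indexing. The hyperparameters (dimensionality, number of nonzero entries, context window) are illustrative assumptions rather than values prescribed by the model; the key point is that each context word receives a sparse random “index vector”, and target embeddings are accumulated token by token, without ever materializing a co-occurrence matrix.

```python
import numpy as np

def make_index_vector(dim, nnz, rng):
    """Sparse ternary index vector: a few random +1/-1 entries, rest zeros."""
    v = np.zeros(dim)
    pos = rng.choice(dim, size=nnz, replace=False)
    v[pos[: nnz // 2]] = 1.0
    v[pos[nnz // 2:]] = -1.0
    return v

def random_indexing(corpus, window=2, dim=2000, nnz=10, seed=0):
    """Accumulate context index vectors into target embeddings, one token at a time."""
    rng = np.random.default_rng(seed)
    index_vectors, embeddings = {}, {}
    for sentence in corpus:
        for i, target in enumerate(sentence):
            vec = embeddings.setdefault(target, np.zeros(dim))
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                ctx = sentence[j]
                if ctx not in index_vectors:
                    index_vectors[ctx] = make_index_vector(dim, nnz, rng)
                vec += index_vectors[ctx]  # no co-occurrence matrix is ever built
    return embeddings

# Toy usage with two tokenized sentences.
corpus = [["the", "doctor", "examined", "the", "patient"],
          ["the", "nurse", "checked", "the", "chart"]]
vectors = random_indexing(corpus)
```

By the Johnson-Lindenstrauss lemma, random projections of this kind approximately preserve pairwise distances with high probability, which is what licenses fixing the low dimensionality in advance.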
Distributional semantics is the study of how distributional information can be used to model semantic facts. Its theoretical foundation has become known as the Distributional Hypothesis: Lexemes with similar linguistic contexts have similar meanings. This chapter presents the epistemological principles of distributional semantics. First, we explore the historical roots of the Distributional Hypothesis, tracing them in several different theoretical traditions, including European structuralism, American distributionalism, the later philosophy of Ludwig Wittgenstein, corpus linguistics, and behaviorist and cognitive psychology. Then, we discuss the place of distributional semantics in theoretical and computational linguistics.
The most recent development in distributional semantics is represented by models based on artificial neural networks. In this chapter, we focus on the use of neural networks to build static embeddings. Like random encoding models, neural networks incrementally learn embeddings by reducing the high dimensionality of distributional data without building an explicit co-occurrence matrix. Unlike the first generation of distributional semantic models (DSMs), also termed count models, neural DSMs produce distributional representations as a by-product of training the network to predict neighboring words, hence the name predict models. Since semantically similar words tend to co-occur with similar contexts, the network learns to encode similar lexemes with similar distributional vectors. After introducing the basic concepts of neural computation, we illustrate neural language models and their use to learn distributional representations. We then describe the most popular static neural DSMs, CBOW and Skip-Gram. We conclude the chapter with a comparison between count and predict models.
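The following is a minimal sketch of Skip-Gram with negative sampling in Python with NumPy. All names and hyperparameters (dim, k, lr, epochs) are illustrative assumptions, and negatives are drawn uniformly here for brevity, whereas real implementations sample from a smoothed unigram distribution. The sketch shows only the core idea: embeddings emerge as a by-product of training a logistic classifier to distinguish observed (target, context) pairs from random ones.

```python
import numpy as np

def train_skipgram(pairs, vocab_size, dim=50, k=5, lr=0.025, epochs=5, seed=0):
    """Minimal Skip-Gram with negative sampling over (target_id, context_id) pairs."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(0.0, 0.1, (vocab_size, dim))   # target ("input") embeddings
    W_out = rng.normal(0.0, 0.1, (vocab_size, dim))  # context ("output") embeddings

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    for _ in range(epochs):
        for t, c in pairs:
            # One observed context (label 1) plus k random negatives (label 0).
            samples = [(c, 1.0)] + [(int(rng.integers(vocab_size)), 0.0)
                                    for _ in range(k)]
            for ctx, label in samples:
                score = sigmoid(W_in[t] @ W_out[ctx])
                grad = lr * (label - score)   # gradient step on the logistic loss
                v_t = W_in[t].copy()          # keep the pre-update target vector
                W_in[t] += grad * W_out[ctx]
                W_out[ctx] += grad * v_t
    return W_in  # each row is the learned distributional vector of one word
```

CBOW inverts the prediction direction, using the averaged context vectors to predict the target; the training machinery is otherwise the same.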
Cross-language communication in healthcare is urgently needed. However, while development of the relevant linguistic technologies and related infrastructure has been accelerating, widespread adoption of translation- and speech-enabled systems remains slow. This chapter examines obstacles to adoption and directions for overcoming them, with emphasis on reliability and customization issues; discusses two major types of speech translation systems and their respective approaches to the same obstacles; surveys some healthcare-oriented communication systems, past and future; and concludes with an optimistic forecast for speech and translation applications in healthcare, tempered by due cautions.
This book offers state-of-the-art research on the design and evaluation of assistive translation tools and systems to facilitate cross-cultural and cross-lingual communication in health and medical settings. Using case studies, it illustrates important principles for designing assistive health communication tools: (1) detectability of errors, to boost the confidence of health professionals; (2) adaptability or customizability for health and medical domains; (3) inclusivity of translation modalities (written, speech, sign language) to serve people with disabilities; and (4) equality of accessibility standards for localised multilingual websites of health content. To summarize these key principles for the promotion of accessible and reliable translation technology, we use the acronym I-D-E-A.
This chapter focuses on the evaluation of distributional semantic models (DSMs). Distributional semantics has usually favored intrinsic methods that test DSMs for their ability to model various kinds of semantic similarity and relatedness. Recently, extrinsic evaluation has also become very popular: the distributional vectors are fed into a downstream NLP task and are evaluated through the system’s performance on that task. The goal of this chapter is twofold: (i) to present the most common evaluation methods in distributional semantics, and (ii) to carry out a large-scale comparison between the static DSMs reviewed in Part II. First, we discuss the notion of semantic similarity, which is central to distributional semantics. Then, we present the major tasks for intrinsic and extrinsic evaluation, and we analyze the performance of a representative group of static DSMs on several semantic tasks. Finally, we explore the differences among the semantic spaces produced by these models with Representational Similarity Analysis.
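As an illustration of the intrinsic protocol, the sketch below scores a vector space against human similarity ratings (for example, a dataset such as SimLex-999) using Spearman correlation. The function names and the triple-based dataset format are assumptions made for the example, not a standard API.

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    """Cosine similarity between two distributional vectors."""
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def similarity_correlation(vectors, dataset):
    """Spearman correlation between model cosines and human similarity ratings.

    vectors: dict mapping words to NumPy arrays (any static DSM's output)
    dataset: iterable of (word1, word2, human_rating) triples
    """
    model_scores, gold_scores = [], []
    for w1, w2, rating in dataset:
        if w1 in vectors and w2 in vectors:  # skip out-of-vocabulary pairs
            model_scores.append(cosine(vectors[w1], vectors[w2]))
            gold_scores.append(rating)
    rho, _ = spearmanr(model_scores, gold_scores)
    return rho
```

Extrinsic evaluation replaces the correlation step with a downstream system (e.g., a classifier) whose task performance serves as the score.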
This chapter discusses the major types of matrix models, a rich and multifarious family of distributional semantic models (DSMs) that extend and generalize the vector space model in information retrieval, from which they derive the use of co-occurrence matrices to represent distributional information. We first focus on a group of matrix DSMs (e.g., Latent Semantic Analysis) that we refer to as classical models, since they directly implement the basic procedure for building distributional representations introduced in Chapter 2. Then, we present DSMs that propose extensions and variants of the classical ones. Latent Relational Analysis uses pairs of lexical items as targets to measure the semantic similarity of the relations between them. Distributional Memory represents distributional data with a high-order tensor, from which different types of co-occurrence matrices are derived to address various semantic tasks. Topic Models and GloVe introduce new approaches to reducing the dimensionality of the co-occurrence matrix, based respectively on probabilistic inference and on a method strongly inspired by neural DSMs.
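To ground the classical procedure that models such as Latent Semantic Analysis implement, here is a hedged Python sketch of one common pipeline: PPMI weighting of the co-occurrence counts followed by truncated SVD. The weighting scheme and the dimensionality k are illustrative choices; actual LSA implementations differ in details such as the weighting function.

```python
import numpy as np

def ppmi(counts):
    """Positive pointwise mutual information weighting of a co-occurrence matrix."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    # Expected counts under independence of rows (targets) and columns (contexts).
    expected = (counts.sum(axis=1, keepdims=True)
                @ counts.sum(axis=0, keepdims=True)) / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts / expected)
    pmi[~np.isfinite(pmi)] = 0.0   # zero counts give -inf; clamp them
    return np.maximum(pmi, 0.0)    # keep only positive associations

def lsa_vectors(counts, k=100):
    """Classical matrix-DSM pipeline: weight the counts, then truncated SVD."""
    M = ppmi(counts)
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] * S[:k]        # rows are k-dimensional word vectors
```

Topic Models and GloVe can be read against this baseline: both replace the SVD step with a different dimensionality-reduction objective.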