Summary

One of the key advantages of transformer networks is the ability to take a model that was pretrained over vast quantities of text and fine-tune it for the task at hand. Intuitively, this strategy allows transformer networks to achieve higher performance on smaller datasets by relying on statistics acquired at scale in an unsupervised way (e.g., through the masked language model training objective). To this end, in this chapter, we will use the Hugging Face library, which has a rich repository of datasets and pretrained models, as well as helper methods and classes that make it easy to target downstream tasks. Using pretrained transformer encoders, we will implement the two tasks that served as use cases in the previous chapters: text classification and part-of-speech tagging.
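As a rough illustration of the fine-tuning setup described above, the sketch below loads a pretrained encoder with a new classification head. `AutoTokenizer` and `AutoModelForSequenceClassification` are classes from the Hugging Face `transformers` library; the helper names `build_label_maps` and `load_classifier` are hypothetical, and the summary does not say which checkpoint or label set the chapter uses.

```python
def build_label_maps(labels):
    """Return (id2label, label2id) dictionaries for a list of class names."""
    id2label = dict(enumerate(labels))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_classifier(checkpoint, labels):
    """Load a pretrained encoder plus a freshly initialized classification head.

    The import is deferred so that build_label_maps works without the library.
    Calling this function downloads the checkpoint on first use.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(labels),  # size of the new output layer
        id2label=id2label,       # stored in the model config for readable predictions
        label2id=label2id,
    )
    return tokenizer, model
```

A call such as `load_classifier("bert-base-cased", ["negative", "positive"])` would return a tokenizer and a model ready for fine-tuning. Both the checkpoint name and the labels are assumptions chosen for illustration.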

Summary

In this chapter, we implement a machine translation application as an example of an encoder-decoder task. In particular, we build on pretrained encoder-decoder transformer models, which exist in the Hugging Face library for a wide variety of language pairs. We first show how to use one of these models out of the box to perform translation for one of the language pairs it has been exposed to during pretraining: English to Romanian. Afterward, we fine-tune the model on a new language combination that it has not seen before: Romanian to English. In both use cases, we use the T5 encoder-decoder model, which has been pretrained for several tasks, including machine translation.
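The out-of-the-box use case can be sketched as follows. T5 selects its task through a text prefix, so translation amounts to prepending that prefix and generating. `AutoTokenizer`, `AutoModelForSeq2SeqLM`, and `generate` come from the Hugging Face `transformers` library; the helper names, the `t5-small` checkpoint, and the `max_new_tokens` value are assumptions, since the summary does not say which model size or generation settings the chapter uses.

```python
def build_t5_prompt(text, source="English", target="Romanian"):
    """Prepend the task prefix that T5 uses to select translation."""
    return f"translate {source} to {target}: {text}"


def translate(text, checkpoint="t5-small", source="English",
              target="Romanian", max_new_tokens=64):
    """Translate text with a pretrained T5 checkpoint (assumed: t5-small).

    The import is deferred so that build_t5_prompt works without the library.
    Calling this function downloads the checkpoint on first use.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    # Encode the prefixed input as PyTorch tensors.
    inputs = tokenizer(build_t5_prompt(text, source, target), return_tensors="pt")
    # The decoder generates target-language token ids.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For the reverse direction, `build_t5_prompt(text, "Romanian", "English")` produces the matching prefix, but a pretrained checkpoint would presumably need the fine-tuning step described above before that prompt yields useful output.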

13 - Using Transformers with the Hugging Face Library

15 - Implementing Encoder-Decoder Methods
