4 - Data Bias and the Natural Language Processing of Metadata

Published online by Cambridge University Press: 13 September 2025

Paul Gooding (University of Glasgow)
Melissa Terras (University of Edinburgh)

Summary

Introduction

Scholarship on library catalogues and cataloguing practices provides numerous examples of how catalogue metadata reflect social biases regarding gender, racialised ethnicity, religion and sexuality, among others. This chapter takes a higher-level view of library catalogues’ metadata, examining the numerous actors and practices that influence the creation and interpretation of catalogue metadata. Drawing on the work of scholars of critical heritage studies and feminism in Europe, North America and Australia, I discuss the power exerted in the creation of catalogue metadata and the implications of such power relationships for research using these metadata. Acknowledging the inevitability of bias in data, I describe opportunities to manage catalogue metadata biases using natural language processing (NLP). Though bias cannot be eliminated, NLP methods, when critically applied, can make the perspectives included in and excluded from a catalogue explicit, informing research that aims to understand, use and improve library catalogues’ metadata and, in turn, the information discovery process. Because the cataloguing practices of GLAM institutions (galleries, libraries, archives and museums) overlap, this chapter refers to work relevant to the GLAM sector as a whole in addition to libraries specifically.

Digital and online technologies are enabling broader access to heritage in long-established libraries, such as the National Library of Scotland, through its Data Foundry platform (https://data.nls.uk).

Information

Type: Chapter
Book: Library Catalogues as Data: Research, Practice and Usage, pp. 61-84
Publisher: Facet
Print publication year: 2025
