Introduction
Female genital schistosomiasis (FGS) is a chronically disabling gynaecological condition estimated to affect up to 56 million women and girls, predominantly within sub-Saharan Africa (Bustinduy et al., 2022). FGS occurs when Schistosoma eggs become trapped in the tissue of the genital reproductive tract (Kjetland et al., 1996). The body’s intense inflammatory response to the eggs, and its attempts to contain the infection through granuloma formation, cause most of the associated morbidity (Kjetland et al., 2005). As part of this inflammatory response, characteristic lesions form throughout the reproductive tract, including on the cervical mucosal surface and along the vaginal canal (Kjetland et al., 2014; Randrianasolo et al., 2015). The resulting symptoms overlap substantially with those of sexually transmitted infections (STIs) and other sexual and reproductive health (SRH) conditions, and include bleeding, abnormal discharge, pain during intercourse, abdominal pain, infertility and subfertility (Kjetland et al., 2008, 2010; Hegertun et al., 2013; Bustinduy et al., 2022).
Diagnosing FGS in resource-constrained endemic settings is challenging, primarily due to the need for expensive, highly centralized equipment and extensive training (Bustinduy et al., 2022; Lamberti et al., 2024a). FGS can be diagnosed through visual examination, molecular testing for Schistosoma DNA or histopathology. Nucleic acid amplification tests, such as polymerase chain reaction (PCR) on genital samples, provide a sensitive and specific test for FGS; however, the processing time and equipment required mean that PCR is not suitable as a point-of-care test (Kjetland et al., 2009; Sturt et al., 2020). Alternatively, isothermal molecular diagnostic tests for FGS on genital samples, such as loop-mediated isothermal amplification and recombinase polymerase amplification, offer a more field-friendly option because they produce results faster and require less equipment (Archer et al., 2020; Van Bergen et al., 2024). Histopathological examination of cervical biopsies and circulating anodic antigen (CAA) tests are also available; however, a biopsy will only detect eggs if taken from a cervical site where eggs have been deposited, and CAA only indicates the presence and burden of live worms (Hoekstra et al., 2021; Nemungadi et al., 2022). Serology, to detect antibodies specific to schistosomes, and urine microscopy, to detect eggs, can be used to diagnose schistosomiasis broadly but are not definitive for FGS, as they do not necessarily confirm genital involvement (Galappaththi-Arachchige et al., 2018).
There is evidence that chronic FGS may persist after the active infection has been cleared, presenting diagnostic challenges that laboratory-based tests are not currently capable of meeting but that visual diagnostics may be well suited to overcome. While molecular and histopathological tests can be highly sensitive and specific for detecting schistosome genetic material or live worms, they may not reliably detect and characterize the chronic changes that follow treatment and infection clearance (Kjetland et al., 2006; Downs et al., 2011). There are indications that this chronic disease state may be more prevalent in older women (Kjetland et al., 2009; Bustinduy et al., 2022). Multiple studies have reported a pattern of younger women having higher rates of schistosome genetic material retrieval from the genital tract, while older women are more likely to present with visually detectable lesions (Kjetland et al., 2009; Sturt et al., 2020; Lamberti et al., 2024b). While there are strong indications of both an active and a chronic stage of FGS disease, no standardized definition of these stages currently exists. The presence of visually identified genital lesions in the absence of detectable active adult worm pairs or schistosome genetic material could be used to indicate a chronic stage of the disease. This chronic stage is characterized by progressive fibrosis and is the consequence of ongoing local inflammatory damage and granuloma formation around the trapped eggs. For patients in this stage, in lieu of a gold standard molecular test, visual diagnostics may remain a necessary diagnostic method (Bustinduy et al., 2022).
The current landscape of visual diagnostics
The existing visual FGS diagnostic criteria, developed around 2010 and described in the World Health Organization FGS Pocket Atlas (2015), involve the visual identification of one or more of 4 types of lesions: grainy sandy patches, homogenous sandy patches, rubbery papules and abnormal vessels (Figure 1; Jourdan et al., 2013; Kjetland et al., 2014; Norseth et al., 2014; Randrianasolo et al., 2015). While these are currently considered sufficient visual criteria for diagnosis, the lesions are often difficult to identify definitively and may not be highly prevalent in the cervical and vaginal tissue of positive cases. The characteristic lesions of FGS on the cervix can resemble both normal variation seen in healthy cervical tissue and various forms of non-FGS altered cervical morphology. Compounding the difficulty, the appearance of a healthy, disease-free cervix can vary significantly with medical and demographic factors such as a person’s age, reproductive history and sexual history (Prendiville and Sankaranarayanan, 2017).

Figure 1. The four classic female genital schistosomiasis lesion types: Grainy sandy patches, homogenous sandy patches, abnormal vessels and rubbery papules. Images taken from the WHO FGS Pocket Atlas, 2015. The WHO FGS Pocket Atlas is licensed under CC BY-NC-SA 3.0.
Visual diagnostics are typically performed with the aid of a colposcope, which is essentially a low-powered microscope with a high-powered light source (Bustinduy et al., 2022). Traditional freestanding colposcopes require a stable electricity supply and are expensive pieces of equipment, costing around USD $20 000 depending on the model (Lamberti et al., 2024a). Handheld colposcopes are an alternative option. They are cheaper (around USD $4000) and battery powered, so they can be charged and used remotely in areas with unstable electrical infrastructure (Søfteland et al., 2021; Sturt et al., 2023; Lamberti et al., 2024a). Both types of colposcope require extensive training to operate, and the price of handheld devices can still be unaffordable (Xue et al., 2020; Bustinduy et al., 2022). As a result, they are not widely available for use in FGS endemic settings, particularly outside of urban areas (Fokom Domgue et al., 2024). Other handheld devices, such as smartphones and digital cameras, have been investigated for use in FGS diagnosis, but may not be useful without enhancement due to lower magnification capabilities and weaker light sources (Søfteland et al., 2021).
There are several important limitations to FGS visual diagnostics, such as the equipment costs and training needs. The fundamental limitation, however, is that visual diagnostics for FGS are highly subjective and lack specificity due to significant visual heterogeneity, as evidenced by the ‘slight’ agreement (Cohen’s kappa = 0.16) between trained expert reviewers (Sturt et al., 2023). There are also no internationally agreed clinical guidelines or standard operating procedures to guide the systematic screening, identification, grading and recording of the characteristics of FGS lesions. However, for women who continue to suffer from chronic lesions, or who are in areas without access to other testing methods, visual diagnostics may represent the only opportunity for diagnosis and better management of their disease. There is, therefore, still a need for visual diagnostics for FGS, and further efforts are required to reduce visual subjectivity, enhance standardization, and improve overall reliability and reproducibility.
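For readers less familiar with agreement statistics, Cohen’s kappa can be computed directly from two reviewers’ paired image ratings. The sketch below (Python, using scikit-learn) shows the calculation; the ratings are invented for illustration and are not study data.

```python
# Minimal sketch: inter-rater agreement between two expert image reviewers.
# The ratings below are invented for illustration only.
from sklearn.metrics import cohen_kappa_score

# 1 = FGS lesion judged present, 0 = absent, for the same 10 images
reviewer_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
reviewer_b = [0, 0, 1, 0, 1, 0, 1, 1, 0, 0]

kappa = cohen_kappa_score(reviewer_a, reviewer_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values near 0 indicate 'slight' agreement
```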
Why should computer vision be applied to FGS visual diagnostics?
One potential solution for the task of improving visual diagnostics is the application of computer vision, a type of artificial intelligence (AI) (Lindroth et al., 2024). Computer vision has been applied to various medical imaging modalities (chest X-ray, magnetic resonance imaging, computed tomography and ultrasound) in many clinical contexts such as dermatology, neurology, pulmonology and ophthalmology (Elyan et al., 2022). FGS computer vision models would use mathematical representations of digital images to enable computers to ‘look at’ an image and then detect or classify FGS lesions.
Computer vision encompasses supervised, unsupervised and semi-supervised learning approaches, each of which can be implemented with a wide range of model designs, known as the ‘model architecture’ (Table 1).
Table 1. Common methods, use cases and architecture examples for computer vision

Supervised computer vision, currently the most common computer vision learning approach, requires that the model is trained on images with associated labels that provide information on the true state of each image (Bishop, 2006; Esteva et al., 2021; Spathis et al., 2022). The majority of supervised medical computer vision relies on convolutional neural networks (CNNs), which process the image step-by-step using multiple stages or ‘layers’ (LeCun et al., 1989; Esteva et al., 2021). A CNN typically starts with convolutional layers to scan and detect patterns, then pooling layers to reduce the image size while preserving important features, before moving on to flattening and fully connected layers to process the information and produce a final prediction. This process is designed to detect and learn patterns such as textures, edges, shapes and other visual features within the images and is used to give outputs such as disease state classification or lesion segmentation and detection (LeCun et al., 1989; Takahashi et al., 2024). There are several common CNN architectures. ResNet models are commonly used for image classification and leverage residual connections that allow the output to skip one or more layers; this means that deep networks (50+ layers) can be built without model performance degrading as the images flow through the high number of layers (He et al., 2015). U-Net models, often used in image segmentation tasks, use an encoder-decoder framework to first ‘encode’ the image by reducing its size while keeping the important features, then ‘decode’ the compressed image by gradually upscaling those features to reconstruct the output (Ronneberger et al., 2015). Another common CNN architecture is the YOLO (You Only Look Once) family of models, commonly used for object detection (Wang et al., 2022). Vision transformers (ViTs) are another, comparatively newer, form of supervised computer vision; they do not scan across the images like CNNs but instead break the images into smaller pieces and analyse the relationships between them all simultaneously (Dosovitskiy et al., 2020). Both CNNs and ViTs have their strengths and each has been found to outperform the other in various tasks; however, there does not appear to be a consistent pattern explaining why one architecture works better than the other in specific cases (Takahashi et al., 2024).
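To make the convolution-pooling-flattening sequence described above concrete, the following sketch (Python/PyTorch) shows a minimal binary lesion classifier. The layer sizes, image dimensions and class count are illustrative assumptions, not a validated FGS architecture.

```python
# Minimal sketch of the CNN structure described in the text: convolutional
# layers scan for local patterns, pooling halves the spatial size while
# keeping salient features, and a fully connected head makes the prediction.
import torch
import torch.nn as nn

class TinyLesionCNN(nn.Module):
    def __init__(self, n_classes: int = 2):  # FGS-positive vs FGS-negative
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # detect local patterns
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 224x224 -> 112x112
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 112x112 -> 56x56
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                # flatten feature maps
            nn.Linear(32 * 56 * 56, n_classes),          # fully connected head
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = TinyLesionCNN()
dummy = torch.randn(1, 3, 224, 224)  # one fake 224x224 RGB colposcope image
print(model(dummy).shape)            # -> torch.Size([1, 2])
```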
In unsupervised learning, the model is not provided with the associated labels and therefore learns about the images by looking for the inherent structures that exist within them (Bengio et al., 2013). The resultant unsupervised model could then classify images into groups that could be labelled post hoc by the investigator. A generative adversarial network (GAN) is an example of a model that can be trained with an unsupervised approach (Goodfellow et al., 2014). A GAN works on game-theoretic principles by having 2 networks, a generator and a discriminator, compete with one another. The generator attempts to create images that are as photorealistic as possible, to trick the discriminator, which is attempting to spot the difference between synthetic and genuine images. Both networks become better at their jobs (the adversarial training process) until a dataset of realistic synthetic images can be produced (Goodfellow et al., 2014). Another example of unsupervised learning is auto-encoder anomaly detection (Hinton and Zemel, 1993; Neloy and Turgeon, 2024). In this architecture, the auto-encoder can be trained only on ‘normal’ (i.e. healthy) images to learn how to recreate them accurately. When it is then presented with unseen images, the model attempts to reconstruct these and measures the difference between the original image and the reconstructed image (the reconstruction error), on the theory that the reconstruction error will be higher for images with anomalies than for those that closely resemble the training images (Stepec and Skocaj, 2021).
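The sketch below (Python/PyTorch) shows the reconstruction-error logic just described. The architecture is deliberately tiny and the decision threshold is a hypothetical placeholder; in practice both would be tuned on held-out data.

```python
# Minimal sketch of auto-encoder anomaly detection: train only on 'normal'
# images, then flag images whose reconstruction error is unusually high.
import torch
import torch.nn as nn

class TinyAutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(  # compress the image to a small code
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(  # reconstruct the image from the code
            nn.ConvTranspose2d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def reconstruction_error(model: nn.Module, image: torch.Tensor) -> float:
    """Mean squared error between an image and its reconstruction."""
    with torch.no_grad():
        return nn.functional.mse_loss(model(image), image).item()

model = TinyAutoEncoder()          # in practice: trained on healthy images only
image = torch.rand(1, 3, 128, 128)
score = reconstruction_error(model, image)
flagged = score > 0.05             # hypothetical threshold; images scoring
                                   # above it would be flagged as anomalous
```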
Semi-supervised learning is also possible: a model is given a small set of annotated images to inform and guide the task (e.g. classification), along with a larger set of unannotated images from which it learns the inherent structure (Van Engelen and Hoos, 2020). This can be beneficial when labelled images are scarce. Many models also use more than one architecture type in a larger pipeline; Yang et al. designed an algorithm for cervical cancer classification built from a pipeline of a multi-modal GAN, a U-Net and an auto-encoder (Yang et al., 2024). The choice of computer vision model architecture is usually a product of the desired output, the complexity of the task, the computational resources available and, importantly, the (annotated) data available (Elyan et al., 2022; Huang et al., 2023). Generally, multiple different architectures are tested for each task, and each model architecture comes with trade-offs. For example, the more complex models often require much higher levels of computational power or are prone to instability, while simpler models may underperform or struggle to handle complex data or tasks effectively (Van Engelen and Hoos, 2020; Esteva et al., 2021; Vargas‐Cardona et al., 2023).
The application of computer vision to the visual signs of FGS on the cervix and surrounding tissue may be a solution to many of the challenges of FGS visual diagnostics (Table 2). Several computer vision core architectures, such as ResNet50 (Liu et al., 2021) and Faster R-CNN (Hu et al., 2019), have already been trained using cervical images in the context of cervical cancer. A 2023 review of cervical cancer algorithms and their applicability to FGS found 13 algorithms that were ‘relevant for FGS diagnosis’; however, none of the 13 had open-source code, so they could not be immediately fine-tuned for FGS images (Jin et al., 2023). Cervical cancer computer vision algorithms are already in the process of being validated, after extensive research, model training and fine-tuning. The HPV-automated visual evaluation (PAVE) study is already underway to validate a screen-triage-treat approach with the support of a computer vision model (de Sanjosé et al., 2024). FGS has also been listed as a potential confounder to some cervical cancer computer vision models (Desai et al., 2022).
Table 2. The barriers to visual diagnostics for female genital schistosomiasis (FGS) and the potential computer vision-based solutions

Training a computer vision model requires significant technical expertise, along with access to high-powered computers fitted with graphics processing units that facilitate the computationally intensive process. Using the model once it is trained is simpler, often requiring only the type of central processing unit found in typical personal computers and smart devices (Pietrołaj and Blok, 2024). This means that it is possible to use clinical computer vision tools in resource-constrained settings, either directly on a smart device or standard computer, or via the cloud if reliable internet services are available. There is also the option to integrate computer vision tools directly within a colposcope for immediate diagnostic feedback and support for practitioners while performing live examinations.
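As an illustration of how lightweight the inference step can be, the sketch below (Python/PyTorch) runs a trained, exported classifier on a single image using only a CPU. The file names and the positive-class index are hypothetical placeholders.

```python
# Minimal sketch: running a *trained* model on a standard CPU.
import torch
from torchvision import transforms
from PIL import Image

model = torch.jit.load("fgs_classifier.pt", map_location="cpu")  # placeholder file
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

image = preprocess(Image.open("colposcope_image.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = torch.softmax(model(image), dim=1)
print(f"P(FGS lesion present) = {probs[0, 1]:.2f}")  # assumes class 1 = positive
```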
Integrating computer vision into FGS diagnostics may have the potential to remove several of the barriers that are currently in place (Table 2). It could alleviate some of the cost associated with colposcopy by supporting the use of other photographic equipment (e.g. smartphones), thereby also improving access in remote areas. As a clinical support tool, computer vision could reduce the amount of high-level specialist training required. Further, computer vision may reduce the subjectivity and biases of even the most highly trained and experienced clinical grader.
How could computer vision be used for FGS?
There are many potential use-cases for FGS computer vision. FGS models could be trained to detect lesions when presented with images from a single patient for diagnostic purposes. The computer vision model could sit within a wider pipeline of diagnostic tools, with the pathway designed to be cost-effective and reflective of the natural progression of acute to chronic disease. Models could also be fed batches of images to screen and then triage patients as appropriate. There are notable parallels between cervical cancer and FGS, and cervical cancer screening programmes offer a promising point of integration for FGS computer vision tools. Another promising application is an online diagnostic tool for those in previously non-endemic settings without specific FGS testing infrastructure, as migration continues and the geographical distribution of Schistosoma haematobium continues to expand (Lingscheid et al., 2017; Marchese et al., 2018; Salas-Coronas et al., 2020). Prognostic uses for FGS computer vision are unlikely to be useful without further research into the relationship between visual signs, symptoms and associated morbidity (Randrianasolo et al., 2015).
For now, FGS computer vision is likely to support only specific subpopulations and function at certain points of the diagnostic pathway. Computer vision tools that have been clinically validated still require the involvement of trained clinicians, rather than serving as a stand-alone replacement for clinicians or other diagnostic tools. For example, a 2021 systematic review of the use of AI for breast cancer detection found that none of the 12 included studies (n = 131 882 women screened) provided sufficient evidence to support the use of computer vision as a stand-alone replacement for radiologists or triage systems (Freeman et al., 2021). Since then, the Mammography Screening with Artificial Intelligence randomized controlled trial of 105 934 women used computer vision to triage patients into a single or double reading by clinicians (Hernström et al., 2025). The results demonstrated that by using computer vision as a support tool, rather than a complete diagnostic replacement, there was an overall increase in cancer detection of 29% (95% CI: 1·09–1·15, P=0·0021) and a 44·2% reduction in the radiologist screening workload (Hernström et al., 2025). As another example, the PAVE study uses a risk stratification and risk-based management approach by coupling computer vision and HPV genotyping in the diagnostic pathway following a positive HPV diagnosis, rather than using computer vision alone as a stand-alone replacement (de Sanjosé et al., 2024).
A proposed diagnostic pathway is presented in Figure 2 as an example of where computer vision may fit within the wider FGS diagnostic and screening pipeline. The endemic setting pathway is based on the hypothesis that older women are more likely to test positive with visual diagnostics, drawing on the results of molecular versus visual diagnostics from multiple studies (Kjetland et al., 2009; Sturt et al., 2020; Lamberti et al., 2024b). As such, a sensible and cost-effective approach to the diagnostic pathway would be to start with the test most likely to be positive. Further work is needed in this area to refine the age parameters and confirm the validity of the testing pipeline.

Figure 2. A hypothesized pathway of the potential use-cases for computer vision supported visual diagnostics within the wider FGS diagnostic pathway. Abbreviations: FGS, female genital schistosomiasis; SRH, sexual reproductive health; CAA, circulating anodic antigen; PCR, polymerase chain reaction.
What are the main challenges to using computer vision for female genital schistosomiasis?
Visual heterogeneity and ground truth annotation
Defining the ‘ground truth’ is one of the fundamental steps in developing a computer vision model for FGS. A ground truth is essentially the reference standard of the model and represents the ‘true state’ of what the model is trying to identify (Shen et al., 2017; Sepehri et al., 2021). In a supervised computer vision model for FGS, the ground truth determines how an image should be annotated (labelled). Without a well-defined ground truth, the line between FGS-positive and FGS-negative cases becomes blurred and incorrect classifications are built into the model (Egemen et al., 2024). An accurate ground truth is also important for the validation of unsupervised and semi-supervised models.
The ground truth annotations on FGS images can be done at different scales (Figure 3). Ideally, images need to be annotated through ‘object detection’ annotation, meaning that each identified lesion is captured within a bounding box or polygon, which is then labelled (e.g. as a homogenous sandy patch; Figure 3, panel C). Higher granularity annotations, such as object detection, enable more precise model training by including only relevant lesions while excluding extraneous pixels (Ilyas et al., 2024). To the best of our knowledge, true object detection annotation has never been carried out on FGS images.
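For illustration, an object-detection ground truth record for a single image might look something like the sketch below (Python). The field names, labels and coordinates are hypothetical, loosely modelled on COCO-style annotation formats rather than any existing FGS data standard.

```python
# Hypothetical object-detection annotation for one colposcope image.
annotation = {
    "image_id": "clinic07_patient0421_visit1.jpg",
    "width": 1920,
    "height": 1080,
    "lesions": [
        {
            "label": "homogenous_sandy_patch",
            "bbox": [412, 230, 185, 140],  # x, y, width, height in pixels
            "reviewer": "expert_01",
        },
        {
            "label": "abnormal_vessels",
            # polygons capture irregular lesion outlines: [x1, y1, x2, y2, ...]
            "polygon": [900, 510, 960, 498, 1012, 560, 955, 610, 890, 575],
            "reviewer": "expert_01",
        },
    ],
}
```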

Figure 3. Different scales of colposcope image labelling from lowest to highest granularity. (A) binary classification (lesion present or absent) per image, (B) quadrant classification (lesion present or absent) per cervical quadrant, (C) multiclass classification, allowing for multiple features to be labelled on a single image, and for lesion size, relative location and other characteristics to be estimated. Panel C labelled using CVAT labelling software (CVAT.Ai, Palo Alto, USA).
Images for FGS computer vision are currently being annotated by expert clinicians. However, this is a laborious task, and very few people in the world are both qualified and available to dedicate the time to reviewing these images. The lack of a well-defined and easily identifiable ground truth, a globally agreed grading system, and associated protocols and data standards for documenting FGS lesions makes this task more daunting still. As a result, most images only have a binary classification annotation (Figure 3, panel A), with some being annotated at a cervical quadrant level (Figure 3, panel B).
Currently, the only available guide for ground truth annotation in visual FGS diagnosis is the appearance of one of the four classic lesion types. However, accurately annotating if and where these lesions are present is difficult and highly subjective. For many other diseases with similarly subjective or heterogeneous visual presentations, a suitable proxy or confirmatory molecular test is available to assist with ground truth definition. For example, computer vision-enabled screening tools for tuberculosis have been clinically validated and are already being used in places like Nigeria (Babayi et al., 2023). In that case, model training and ground truth annotation were often supported by bacterial culture and PCR test results (Babayi et al., 2023; Hansun et al., 2023; Scott et al., 2025). A suitable proxy or confirmatory test for FGS ground truth annotation is not always possible, as laboratory tests can be unsuitable or have low sensitivity/specificity for chronic FGS lesions that persist after the infection has been cleared (Hoekstra et al., 2021; Nemungadi et al., 2022; Sturt et al., 2022; Lamberti et al., 2024a). If a model is only trained on images with associated confirmatory laboratory testing, then there is a risk of (A) biasing the dataset towards those who have an active infection and (B) significantly decreasing the number of images that can be fed into the model, as not all images have associated test data.
Symptom-based confirmation of FGS, where the ground truth would be confirmed by the presence of certain symptoms, is also difficult, as there is very little consistent evidence of the association between lesion and symptom presentation (Kjetland et al., 2012; Lamberti et al., 2024b). The overlap between the symptoms and endemic regions of STIs and FGS further confounds the ability to attribute a symptom either to a specific FGS lesion presentation or to FGS in general (Poggensee et al., 2000; Leutscher et al., 2008; Sturt et al., 2021). A Tanzanian study of 347 women found that symptom-based diagnosis of FGS had a specificity of only 15% (95% CI: 9·7–20·3%; Mbwanji et al., 2024). However, colposcopy was used as the diagnostic standard for FGS in this study, which itself lacks specificity (Sturt et al., 2023), and urine microscopy was used as a comparison and supporting diagnosis, despite the imprecise relationship between FGS and S. haematobium eggs in the urine (Christinet et al., 2016; Rafferty et al., 2021).
Small and homogenous datasets
The downstream impact of issues with ground truth definition and image annotation is the comparatively small number of annotated images available to train FGS computer vision models. While some computer vision models for other diseases have been trained on hundreds of thousands or millions of images, there are currently only tens of thousands of images worldwide for FGS, and a centralized database is yet to be created. To create a computer vision model, image datasets must be broken into training, validation and testing subsets, and blurred or obstructed images cannot be used in many training methods. This further decreases the number of ‘useful’ images that exist for the purposes of model training and testing.
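When datasets are this small, how they are split matters. The sketch below (Python, scikit-learn, on synthetic stand-in data) shows a patient-level split, so that images from the same woman never appear in both the training and testing subsets, which would otherwise leak information and inflate performance estimates.

```python
# Minimal sketch of a grouped training/validation/test split on synthetic data.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n_images = 100
labels = rng.integers(0, 2, size=n_images)        # fake positive/negative labels
patient_ids = rng.integers(0, 40, size=n_images)  # several images per patient

splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, holdout_idx = next(
    splitter.split(np.zeros(n_images), labels, groups=patient_ids)
)
# the 30% holdout would be split again into validation and test subsets
```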
The images that do exist are homogeneous because the only sources of FGS image datasets are field studies from a small number of countries, as no country currently has a screening or diagnostic programme (Ndubani et al., 2024). This can mean that the model is not sufficiently trained to handle images from other countries with different confounding diseases, sociodemographic variation or clinicians with their own imaging techniques and protocols. The lack of heterogeneity in location, photographic equipment and clinical staff can become an issue, as computer vision models can become overly specialized, performing well only on images captured in the same way and place as the images used in training (Zech et al., 2018). This means that models can exhibit a high degree of internal validity but poor external validity (Ting et al., 2017; Zech et al., 2018). A key example of this was a model trained to detect pneumonia in chest X-rays that was also able to predict, with 100% accuracy, whether the image was taken with the inpatient portable X-ray machine or the emergency department X-ray machine (Zech et al., 2018).
Troubleshooting the challenges in FGS computer vision
General troubleshooting
There are many potential solutions to the challenges of poor ground truth definition, image annotation, small datasets and generalizability. Broadly speaking, adapting and utilizing different computer vision architectures may assist in overcoming the challenges in FGS computer vision. Unsupervised models are one option, as they remove the need to annotate images for model training. These unsupervised models may also uncover commonalities in the images of positive cases that have not yet been detected by the human eye (Patel, 2019). Further, unsupervised models are potentially more generalizable, as they are not trained on specific labels (Huang et al., 2023).
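One common unsupervised pattern is to extract image features with a pretrained network and cluster them, leaving an investigator to label the resulting groups post hoc. The sketch below (Python, PyTorch/torchvision and scikit-learn) illustrates the idea; the random tensors stand in for preprocessed colposcope images and the cluster count is an arbitrary assumption.

```python
# Minimal sketch: pretrained feature extraction + k-means clustering.
import torch
from torchvision import models
from sklearn.cluster import KMeans

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # drop the classification head, keep features
backbone.eval()

images = torch.rand(32, 3, 224, 224)  # stand-in for preprocessed images
with torch.no_grad():
    features = backbone(images).numpy()  # one 2048-d vector per image

clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)
# an expert could then review a sample of images from each cluster and name it
```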
Annotation troubleshooting
Label generating algorithms or ‘self-annotating’ models are an option when there are insufficient resources for manual image annotation (Huang et al., 2023). One such approach is self-supervised learning, where the model generates its own labels from the raw data it is given and then trains itself in further rounds (Spathis et al., 2022). In a review of 79 studies, self-supervised computer vision models increased the overall accuracy of models by up to 29% (95% CI: 0·44%, 29·2%; Huang et al., 2023). Pseudo-labelling algorithms, a common method in semi-supervised learning, learn from the labels of a small set of annotated images and the inherent structures of a larger set of unannotated images in order to ‘label’ the unannotated data. Another option that shows promise for automated medical image annotation is MedSAM (Medical Segment Anything Model), a model trained on 1 570 263 medical images. It was created to be a pixel-level ‘universal medical image segmentation’ tool that can automatically segment medical images based on the model’s understanding of the anatomical structures it learnt during the training phase (Ma et al., 2024). While colposcope images were not included in the original MedSAM training set, it could, in theory, be fine-tuned for the automated or semi-automated annotation of colposcope images.
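The pseudo-labelling loop can be summarized in a few lines. The sketch below (Python/PyTorch) keeps only predictions the current model is confident about and recycles them into the next training round; the confidence threshold and model are placeholders.

```python
# Minimal sketch of pseudo-labelling for semi-supervised learning.
import torch

CONFIDENCE_THRESHOLD = 0.9  # hypothetical cut-off

def pseudo_label(model, unlabelled_images):
    """Return (image, provisional label) pairs the model is confident about."""
    model.eval()
    confident_pairs = []
    with torch.no_grad():
        for image in unlabelled_images:
            probs = torch.softmax(model(image.unsqueeze(0)), dim=1)[0]
            confidence, label = probs.max(dim=0)
            if confidence >= CONFIDENCE_THRESHOLD:
                confident_pairs.append((image, int(label)))
    return confident_pairs

# Overall loop (as described in the text):
# 1. train `model` on the small expert-annotated set
# 2. pairs = pseudo_label(model, unlabelled_images)
# 3. retrain on the annotated set plus `pairs`, and repeat
```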
Ground truth troubleshooting
The development of effective and high-performing computer vision tools for FGS is contingent upon the development of a defined and widely accepted ground truth. Achieving this will require the collaboration of experts and the implementation of robust, standardized protocols. Expert reviewers will undoubtedly play a pivotal role in both the development and implementation of these protocols, potentially contributing without direct compensation to an often laborious task. In the absence of coordinated expert effort, model performance is likely to be compromised by inconsistent or poorly validated reference standards.
At present, an FGS visual diagnosis is binary (positive/negative) and the way a visual diagnosis is made is not highly standardized. The World Health Organization FGS Pocket Atlas (2015) provides broad guidance on the types of lesions to identify, but each academic team currently uses slightly different visual diagnostic protocols and grading scales (if any are used at all). The development of a refined visual diagnostic and grading tool, guided by expert consensus, might feasibly provide guidance for a more systematized classification of FGS. This grading tool could combine lesion presentation (e.g. size, colour and location), patient characteristics and associated symptoms, and could form the basis of a more robust and standardized ground truth definition.
Redefining the annotation classes is an option when the ground truth is poorly defined. One option is to add an ‘indeterminate’ class to create a multiclass ordinal classification, rather than a binary classification of positive or negative. This introduces some flexibility within the model to handle images that are neither obviously positive nor negative. Such a multiclass ordinal classification was used by Egemen et al. for a cervical cancer computer vision model, with positive results (Egemen et al., 2024). However, this option does risk classifying too many images as neither positive nor negative, which is unhelpful for diagnostic purposes. Soft labelling is another annotation approach, where instead of hard labelling (positive/negative) a probabilistic score is given to each label to represent the degree of uncertainty in the label. This has been used before in situations where the ground truth is ambiguous and some labels may be more likely to be correct than others (Ahfock and McLachlan, 2021). By having multiple trained reviewers (3 or more) read each image, confidence in the ground truth could be increased. If multiple reviewers used soft labelling, then a mean probabilistic score could be provided for each image.
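The sketch below (Python/PyTorch) shows one way the multi-reviewer soft-label idea could be implemented: reviewers’ ratings are averaged into a probabilistic target and the model is trained against it. The ratings and model output are invented for illustration, and passing probabilistic targets to cross-entropy assumes PyTorch 1.10 or later.

```python
# Minimal sketch of soft labelling from multiple reviewers.
import torch
import torch.nn.functional as F

# three reviewers rate one image: 1 = lesion present, 0 = absent
reviewer_ratings = torch.tensor([1.0, 1.0, 0.0])
p_positive = reviewer_ratings.mean()          # 0.67: majority view, but uncertain
soft_target = torch.stack([1 - p_positive, p_positive]).unsqueeze(0)

logits = torch.tensor([[0.2, 0.8]])           # placeholder model output
loss = F.cross_entropy(logits, soft_target)   # accepts probabilistic targets
```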
Small dataset troubleshooting
Using a GAN, or a similar image generation model, may be useful in increasing the volume of images available for training. The GAN architecture has been applied to various medical image types such as retinal images, brain tumour MRIs and skin cancer images (Ahmad et al., 2022). A GAN can also be used to deblur images, a method used in 2019 to deblur and enhance cervical images captured with smartphones (Ganesan et al., 2019). That paper did report an increase in detection accuracy (+21·4%); however, the model was only tested on 14 biopsy-confirmed abnormal images that required manual computational blurring (as they were originally in sharp focus). Still, the paper stands as a proof of concept for the future application of this method to deblur colposcope images and therefore increase the number of useable images.
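For readers unfamiliar with the adversarial set-up, the heavily condensed sketch below (Python/PyTorch) shows one training step of the generator-versus-discriminator game described earlier. Fully connected networks and random tensors stand in for realistic architectures and real images.

```python
# Highly condensed sketch of one GAN training step.
import torch
import torch.nn as nn

latent_dim = 100
generator = nn.Sequential(            # noise vector -> flattened 64x64 RGB image
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 3 * 64 * 64), nn.Tanh(),
)
discriminator = nn.Sequential(        # flattened image -> real/fake score
    nn.Linear(3 * 64 * 64, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_images = torch.rand(16, 3 * 64 * 64)  # stand-in for genuine images
noise = torch.randn(16, latent_dim)

# discriminator step: score genuine images high and synthetic images low
fake_images = generator(noise).detach()
d_loss = (bce(discriminator(real_images), torch.ones(16, 1))
          + bce(discriminator(fake_images), torch.zeros(16, 1)))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# generator step: produce images the discriminator scores as genuine
g_loss = bce(discriminator(generator(noise)), torch.ones(16, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```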
Integrating other data types (i.e. clinical, diagnostic and sociodemographic data) into the model may also improve performance and compensate for the smaller image datasets. Liu et al. included this type of data in a model created to detect cervical pre-cancer based on colposcope images from patients in Shandong Province, China (Liu et al., 2021). This did not improve model accuracy overall but may suggest a basis for further investigation in FGS computer vision. The inclusion of such data is made possible by the rich datasets being collected in field studies alongside colposcope images. By integrating and modelling other information on patients, such as age, sociodemographic factors and STI status, a computer vision model may become more sensitive and specific in detecting FGS.
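A common way to combine the two data types is to concatenate image features with a vector of clinical covariates before the final prediction layer. The sketch below (Python/PyTorch) shows this fusion pattern; the backbone, feature count and covariates are illustrative assumptions.

```python
# Minimal sketch of fusing image features with tabular clinical data.
import torch
import torch.nn as nn
from torchvision import models

class MultiModalFGSModel(nn.Module):
    def __init__(self, n_clinical_features: int = 5, n_classes: int = 2):
        super().__init__()
        self.image_branch = models.resnet18(weights=None)
        self.image_branch.fc = nn.Identity()           # 512-d image features
        self.head = nn.Sequential(
            nn.Linear(512 + n_clinical_features, 64),  # concatenated features
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, image, clinical):
        fused = torch.cat([self.image_branch(image), clinical], dim=1)
        return self.head(fused)

model = MultiModalFGSModel()
image = torch.rand(1, 3, 224, 224)
clinical = torch.tensor([[34.0, 1.0, 0.0, 1.0, 0.0]])  # e.g. age + coded covariates
print(model(image, clinical).shape)                    # -> torch.Size([1, 2])
```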
Generalizability troubleshooting
The few sources of new images, and proprietary ownership over these images, make robust testing of the generalizability of computer vision models difficult. The generalizability of a computer vision model is its ability to perform well on never-before-seen images, particularly images that vary from the training set in parameters such as photographic equipment, imaging methods and geographical location. A centralized image source that captures the photographic and geographic variation that should be included within FGS models would be highly beneficial. This would increase the generalizability of the models developed, so that they could be applied to images taken in various locations and with various colposcope and imaging equipment. The benefit of doing this is highlighted by Ekem et al. in their development of a colposcope image deblurring algorithm (Ekem et al., 2025). The model was trained using images from 2 different handheld colposcope models and one freestanding colposcope, from patients across 6 countries (India, Zambia, Honduras, USA, Peru and Tanzania), reflecting different sociodemographic factors and co-endemicities. The model was then tested and validated on images from Kenya and on a holdout set (a portion of images set aside from the training set to later be used for testing) from all 6 countries. The model was found to be generalizable to these images (accuracy 89%), reflecting the benefit of including a variety of image sources within computer vision training sets.
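A standard way to probe this kind of cross-site generalizability is leave-one-site-out validation: hold out every image from one site, train on the rest, and evaluate on the held-out site. The sketch below (Python, scikit-learn) illustrates the procedure; the sites, labels and features are synthetic stand-ins.

```python
# Minimal sketch of leave-one-site-out evaluation on synthetic data.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
n_images = 120
sites = rng.choice(["site_A", "site_B", "site_C"], size=n_images)
labels = rng.integers(0, 2, size=n_images)
features = rng.random((n_images, 16))  # stand-in for extracted image features

for train_idx, test_idx in LeaveOneGroupOut().split(features, labels, groups=sites):
    held_out_site = sites[test_idx][0]
    # train_model(features[train_idx], labels[train_idx])  # placeholder
    # evaluate(features[test_idx], labels[test_idx])       # placeholder
    print(f"Evaluating on held-out site: {held_out_site}")
```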
Pooling images in a centralized database is likely to go a long way in improving the generalizability of these computer vision models. However, it does remain a possibility that computer vision models will need to be fine-tuned each time they are deployed in a new setting to perform at maximum capability.
Implementation
A 2024 review of computer vision in healthcare settings found that the vast majority of computer vision models intended for clinical use are still in the development and testing phase, and that reporting and documentation on the implementation of computer vision was ‘scarce’ (Lindroth et al., 2024). Furthermore, there is currently very little information on the success of these tools on the African continent. In a review of 86 randomized controlled trials on the use of AI in clinical practice, only 2 were conducted in Africa (Han et al., 2024). So, while a focus on model development is important, it is equally important to begin taking steps to ensure the successful implementation of the technology and to engage stakeholders to co-produce ideas about how these tools should be implemented safely, equitably and effectively. Taking these steps alongside model development will allow for more expedited deployment.
The implementation of computer vision in the resource-constrained settings where FGS is endemic poses a particular set of challenges. While there has been a huge increase in technological infrastructure (mobile phone availability, Wi-Fi coverage and appliance charging capabilities) across the African continent, this infrastructure remains limited in many areas (Musa et al., 2023). In 2021, more than 56 000 rural hospitals in sub-Saharan Africa did not have an electrical supply (Moner-Girona et al., 2021). This, along with a paucity of financial support for the integration of these tools for neglected tropical diseases and the associated training, may make scalability difficult (The Lancet Digital Health, 2023). Ensuring that the tools remain computationally efficient and placing them within commonly used devices (such as basic tablets or smartphones) or on the cloud will support implementation. Keeping computer vision models open source, freely available, and compatible with multiple colposcope models and photographic devices will help keep costs down.
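One practical step towards device compatibility is exporting a trained model to a portable format. The sketch below (Python/PyTorch) exports to ONNX, a format that lightweight runtimes can execute on low-cost laptops, tablets and phones; the model and file name are placeholders.

```python
# Minimal sketch: exporting a trained model to the portable ONNX format.
import torch
from torchvision import models

model = models.resnet18(weights=None)  # stand-in for a trained FGS classifier
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "fgs_classifier.onnx",
                  input_names=["image"], output_names=["logits"])
# the .onnx file can then be run with lightweight engines such as ONNX Runtime
```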
The ethical implications of these kinds of tools need to be considered throughout the development and implementation phases. Ethical risks include the risk to bodily autonomy and privacy, particularly through data breaches. The risk of exploitation following any data breach is also higher in this area than in others, because the highly sensitive and often stigmatizing context of SRH and FGS acts as a risk multiplier (World Health Organization, 2024). The World Health Organization (WHO) has released ethical and regulatory guidance for AI and a technical brief on AI for SRH (World Health Organization, 2021, 2024). However, these broad overviews should also be accompanied by country-specific guidance that considers cultural practices, perspectives, and regulatory and legal differences (Eke et al., 2023).
Consistent, informed discussion and exploration across development teams regarding the benefits and implementation of computer vision for FGS are essential. A meeting of those working on cervical cancer computer vision took place at the 38th International Papillomavirus Conference in November 2024, with the aim of standardizing approaches, developing validation criteria and identifying research gaps. A similar meeting of those working on FGS would likely be highly beneficial.
A pathway forward
Computer vision holds immense promise for improving the way disease is detected. Yet FGS, perhaps more than other diseases, presents barriers to the development and implementation of computer vision. Variation in healthy cervixes and in FGS lesions, along with visually confounding diseases, makes defining a ground truth difficult. This, together with the very small number of expert image reviewers, means that at present few images are annotated at a highly granular level. The generalizability of these models is a further significant obstacle. Despite these barriers, with the AI field growing at an exponential rate, and with collaboration between teams, these tools have the potential to be successfully used in FGS.
Author’s contributions
MEL contributed to conceptualization, draft writing, review and editing. ChR and ALB contributed to supervision, reviewing and editing.
Financial support
MEL was supported by the Medical Research Council [MR/W006677/1]. ChR was supported by the LSHTM Global Health Analytics Group’s ‘Pay What You Can’ funding scheme, a crowdsourced initiative that facilitates innovative and open methods research for the public good. ALB is supported by the UKRI Future Leaders Fellowship [MR/Z000033/1].
Competing interests
The authors declare there are no conflicts of interest.
Ethical standards
Not applicable.