Photometrically selected protocluster candidates at $z\sim9-10$ in the JWST COSMOS-Web field

Cossas K.-W. Wu*: Affiliation:
Institute of Astronomy, National Tsing Hua University, Hsinchu, Taiwan
Chih-Teng Ling: Affiliation:
Department of Astronomical Science, SOKENDAI (The Graduate University for Advanced Studies), Mitaka, Tokyo, Japan National Astronomical Observatory of Japan, Mitaka, Tokyo, Japan
Tomotsugu Goto: Affiliation:
Institute of Astronomy, National Tsing Hua University, Hsinchu, Taiwan Department of Physics, National Tsing Hua University, Hsinchu, Taiwan
Amos Y.A. Chen: Affiliation:
Department of Physics, National Tsing Hua University, Hsinchu, Taiwan
Tetsuya Hashimoto: Affiliation:
Department of Physics, National Chung Hsing University, Taichung, Taiwan
Seong Jin Kim: Affiliation:
Institute of Astronomy, National Tsing Hua University, Hsinchu, Taiwan
Simon C.-C. Ho: Affiliation:
Research School of Astronomy and Astrophysics, The Australian National University, Canberra, ACT, Australia Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn, VIC, Australia OzGrav: The Australian Research Council Centre of Excellence for Gravitational Wave Discovery, Hawthorn, VIC, Australia ASTRO3D: ARC Centre of Excellence for All-Sky Astrophysics in 3D, ACT, Australia
Ece Kilerci: Affiliation:
Department of Astronomy and Space Sciences, Science Faculty, İstanbul University, Beyazıt, İstanbul, Türkiye
Yu-Yang Hsiao: Affiliation:
Centre for Astrophysics | Harvard & Smithsonian, Cambridge, MA, USA Centre for Astrophysical Sciences, Department of Physics and Astronomy, The Johns Hopkins University, Baltimore, MD, USA Space Telescope Science Institute (STScI), Baltimore, MD, USA
Yuri Uno: Affiliation:
Department of Physics, National Chung Hsing University, Taichung, Taiwan
Terry Long Phan: Affiliation:
Institute of Astronomy, National Tsing Hua University, Hsinchu, Taiwan
*: Corresponding author: Cossas K.-W. Wu; Email: danniel258000@gmail.com

Abstract

High-redshift protoclusters are crucial for understanding the formation of galaxy clusters and the evolution of galaxies in dense environments. The James Webb Space Telescope (JWST), with its unprecedented near-infrared sensitivity, enables the first exploration of protoclusters at $ z \gt 10 $. Among JWST surveys, COSMOS-Web Data Release 0.5 offers the largest area ($\sim 0.27$ deg$^2$), making it an optimal field for protocluster searches. In this study, we searched for protoclusters at $ z \sim 9-10 $ using 366 F115W dropout galaxies. We evaluated the reliability of our photometric redshifts through validation tests with the JADES DR3 spectroscopic sample, obtaining a likelihood of falsely identifying interlopers of $\sim25\%$. Overdensities ($\delta$) are computed by weighting galaxy positions with their photometric redshift probability density functions, using a 2.5 cMpc aperture and a redshift slice of $\pm 0.5$. We selected as the most promising protocluster core galaxies those with an overdensity greater than the 95th percentile of the distribution of the 366 F115W dropout galaxies. Member galaxies are then linked to the core galaxies within a projected separation of 7.5 cMpc, yielding seven protocluster candidates. These seven protocluster candidates have inferred halo masses of $ M_{\text{halo}} \sim 10^{11}\,{\rm M}_{\odot} $. The detection of such overdensities at these redshifts provides a critical test for current cosmological simulations. However, confirming these candidates and distinguishing them from low-redshift dusty star-forming galaxies or Balmer-break galaxies will require follow-up near-infrared spectroscopic observations.

Keywords

Galaxies: high-redshift; galaxies: clusters: general; infrared: galaxies

Information

Type: Research Article
Information: Publications of the Astronomical Society of Australia, Volume 42, 2025, e141

DOI: https://doi.org/10.1017/pasa.2025.10096
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press on behalf of Astronomical Society of Australia

1. Introduction

The subtle density fluctuations in the early Universe led to the aggregation of galaxies. Over time, the galaxies formed protoclusters, the first assemblies of galaxies in which all the haloes/galaxies will eventually merge into the final galaxy cluster in the local Universe (Muldrew, Hatch, & Cooke 2015). Galaxy clusters are the most extreme matter overdensities within the cosmic web and reside at the nodes of the large-scale structure predicted by the standard cosmological framework of hierarchical structure formation (White & Rees 1978); protoclusters are therefore also expected to trace the characteristics of the large-scale structure (LSS). Given their rarity, particularly during the early Epoch of Reionisation (EoR), confirming the existence of protoclusters not only serves as an observational milestone but also provides crucial insights into the evolutionary history of the Universe. Despite their significance, our understanding of how galaxies evolve and organise into protoclusters, eventually becoming part of the LSS, remains limited. This knowledge gap arises primarily from the difficulty of achieving the deep sensitivity required at near-infrared wavelengths, where much of the critical information about these distant and early structures resides. Addressing these challenges is essential to unravel the complexities of galaxy evolution and the emergence of large-scale cosmic structures.

During the past half-decade, astronomers have found many galaxy protoclusters at $2 \lt z \lt 8.5$ (Higuchi et al. 2019; Harikane et al. 2019; Hu et al. 2021; Polletta et al. 2021; Laporte et al. 2022; Brinch et al. 2024; Helton et al. 2024a; Wang et al. 2024; Chen et al. 2024; Fudamoto et al. 2025, etc.). We expect to extend our knowledge of the evolution of galaxies in the distant Universe ($z \gt 8$) by using the James Webb Space Telescope (JWST; Gardner et al. 2023), whose unprecedented sensitivity and resolution in the infrared (IR) can reveal these mysterious early galaxies.

Recent studies from JWST have begun to uncover spectroscopically confirmed high-z protoclusters: Laporte et al. (2022) found a lensed protocluster candidate at $z = 7.66$ behind the SMACS0723-7327 galaxy cluster, and Chen et al. (2024) reported an overdensity of Lyman-$\alpha$ galaxies at $z = 7.88$ in the CEERS EGS field. At $z \gt 8$, Larson et al. (2022) discovered a potential protocluster comprising a pair of galaxies at $z_{\text{spec}} \sim 8.7$ with a separation of 3.5 proper Mpc ($\sim$ 34 comoving Mpc (cMpc)) among 11 galaxies at $z \gt 8$ in the CANDELS survey fields ($\sim 800$ arcmin$^{2}$). Similarly, Helton et al. (2024b) confirmed an overdensity that includes six galaxies at $z_{\text{spec}}=8.22$ spanning 2.2 cMpc.

Observed quantities such as the number density of protoclusters give an overview of structural development in the early Universe. However, the previous JWST view from Helton et al. (2024b) is limited to the 134 arcmin$^{2}$ survey area of the JWST Advanced Deep Extragalactic Survey (JADES; Eisenstein et al. 2023). A larger survey area will place tighter constraints on the number density of protoclusters. The COSMOS-Web survey from the Cycle 1 JWST treasury program (Casey et al. 2023) provides an opportunity to search for high-redshift protoclusters with extensive coverage and multi-wavelength auxiliary data. The field is particularly well suited because it has also been observed by major space-based (HST, Spitzer, Herschel, and others) and ground-based telescopes (Keck, Subaru, and others). In addition, the COSMOS-Web field has the largest survey area ($\sim$ 0.5 deg$^2$) among the JWST surveys to date. Furthermore, by simply scaling the number density of the very distant protocluster candidates found by Larson et al. (2022) and Helton et al. (2024b) by the survey area ratio, we expect the COSMOS-Web field of view to contain $\sim 1-2$ protoclusters at $z\sim9$. For these reasons, COSMOS-Web is a natural site for discovering potential protoclusters in the early Universe.

This paper is organised as follows. In Section 2, we introduce the basic data properties of the COSMOS-Web field. We summarise the construction of a multi-band catalogue in Section 2.2. We then explain the methodology in Section 3, including the colour-selection criteria applied to the photometry. We discuss the results of spectral energy distribution (SED) fitting to high-redshift galaxy candidates in Section 3.2. Lastly, in Section 4, we use the overdensity to examine the possible clustering of the early galaxies.

Unless otherwise specified, we adopt the Planck15 cosmology (Adam et al. 2016), i.e. a $\Lambda$ cold dark matter cosmology with ($\Omega_{m}$, $\Omega_{\Lambda}$, $\Omega_{b}$, $h$) = (0.307, 0.693, 0.0486, 0.677).

2. Data

This work utilizes the public data release 0.5 (hereafter DR0.5) from the COSMOS-Web survey (GO #1727, PIs Kartaltepe & Casey; Casey et al. 2023). DR0.5 contains the observations taken during early January and June/July 2023 (observation Nos. 043-048 and 078-153), and covers 0.29 deg$^{2}$ in total. The release accordingly separates the data into 10 mosaic images, A1–A10. Near-Infrared Camera (NIRCam) observations are conducted with four filters (F115W, F150W, F277W, F444W), while Mid-Infrared Instrument (MIRI) observations are conducted with a single filter (F770W). The data are available on the official website of the COSMOS team.Footnote ^a

2.1 Data reduction

The raw images from JWST were processed using the JWST Calibration Pipeline v1.10.0 with CRDS context pmap 1075 (Bushouse et al. 2022). The COSMOS team implemented additional customised modifications and reduction techniques to enhance the data product. The astrometric precision of the NIRCam mosaics has been improved through alignment with Gaia-DR3 and the COSMOS2020 catalogue (Weaver et al. 2022). This was enabled by a reference catalogue based on COSMOS HST/F814W imaging data, which had been meticulously reprocessed following the methodology originally described in Koekemoer et al. (2011).

2.2 Multi-wavelength photometric catalogue

In this section, we describe how we build our multi-wavelength photometric catalogue. We use Source-Extractor V2.19.5 (hereafter SEx, Bertin & Arnouts 1996) for source extraction and photometry. We use the default parameters except for ${\tt DETECT\_THRESH}=2.5$, ${\tt DEBLEND\_NTHRESH}=48$, and ${\tt MAG\_ZEROPOINT}$ in the SEx configuration file. ${\tt MAG\_ZEROPOINT}$ for each filter is set to the value obtained from the first equation on the NIRCam zeropoints website.Footnote ^b To efficiently identify the dropout sources by colour, we run SEx in dual mode. Dual mode performs forced photometry on the measurement images for every source detected in the detection image, but it requires identical pixel coordinates in both images. For each mosaic image from DR0.5, we co-add F150W, F277W, and F444W as the detection image using SWarp (Bertin 2010) with its default parameters. The measurement images are the DR0.5 mosaic images re-projected and resampled onto the same x,y coordinates as the detection images. We resample both the detection and measurement images to a pixel scale of 0.03 arcsec as part of a customised preprocessing step.

We measure the flux within a 0".15 radius aperture, which is slightly larger than the FWHM of the point-spread function in the F444W band (0".145), then apply the aperture correction with the interpolated coefficients derived from the JWST CRDS.Footnote ^c

Our JWST photometric catalogue is cross-matched to the COSMOS-2020 catalogue (Weaver et al. 2022) with a maximum matching tolerance of 75 milliarcsec (2$\sigma$) for each pair. We use the median of the difference in each coordinate to align the HST and JWST cutout images. The COSMOS-2020 catalogue (Weaver et al. 2022) compiles previous results from various telescopes and surveys, such as UVISTA, HSC, and HST. We further compare the observed AUTO magnitudes between JWST NIRCam F150W and the UVISTA H band for every matched source. We found that the median difference between the UVISTA H-band and JWST NIRCam F150W AUTO magnitudes is less than $25\%$ down to $\sim$ 27.0 AB magnitude.

After cross-matching with the COSMOS-2020 catalogue, we performed forced photometry on the HST image with Photutils for the unmatched sources. To evaluate the flux uncertainty of the forced photometry in the F814W band, we placed a 0.3"–0.6" annulus centred on the source and measured the flux uncertainty with Photutils. The flux uncertainty in the JWST bands was measured with SEx using the same 0.3"–0.6" annulus, and includes photon and detector noise.

3. Methodology

In this section, we first follow the colour criteria outlined by Harikane et al. (2023) to identify F115W dropout galaxies. Then, we estimate the photometric redshift (photo-z) and generate a redshift probability distribution function (PDF) for the selected candidates. Finally, we explain the method for estimating the number density of the galaxies as well as their overdensity.

3.1 Colour selection

Harikane et al. (2023, H+23) set an example for selecting high-redshift galaxy candidates in recent JWST observations with a simple yet efficient colour selection. With the four major NIRCam filters available in COSMOS-Web (F115W, F150W, F277W, and F444W), only the F115W-dropout selection of H+23 can be applied identically (Eq. 1).

(1)

\begin{align} F115W-F150W &\gt 1.0 \ \wedge \nonumber \\ F150W-F277W &\lt 1.0 \ \wedge \\ F115W-F150W &\gt F150W-F277W + 1.0 \nonumber \end{align}

We measure the colours of each object using a fixed circular aperture with a diameter of 0.3 arcsec and apply the aperture correction from the coefficients listed in the JWST CRDS. Figure 1 shows the distribution of our selected galaxies on the two-colour diagram. In total, we selected 3 739 objects with matched colours.

Figure 1. The two-colour diagram for sources with $S/N \gt 2$ in the F150W+F277W+F444W detection image of COSMOS-Web DR0.5 (Casey et al. 2023). The green area shows colours that satisfy the F115W-dropout criteria from Harikane et al. (2023). The colour is measured with a 0.3" diameter circular aperture. The numbers indicate how many sources there are in each subset.

To make the selection of dropout galaxies more robust, we adopt a 2$\sigma$ flux density cut in the F814W and F115W bands within a circular aperture of 0.3" diameter (Harikane et al. 2022, 2023; Hainline et al. 2024). Based on the forced photometry, we rejected 2 864 of the 3 739 objects because they have S/N $\gt$ 2 in either the F814W or the F115W filter, leaving 875 objects after this step.
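The selection described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' pipeline: the function names and argument layout are hypothetical, magnitudes are assumed to be aperture-corrected AB magnitudes, and fluxes and their uncertainties are assumed to share the same units with positive uncertainties.

```python
def is_f115w_dropout(m115, m150, m277):
    """Return True if AB magnitudes satisfy the three colour cuts of Eq. (1)."""
    c1 = m115 - m150  # F115W - F150W
    c2 = m150 - m277  # F150W - F277W
    return (c1 > 1.0) and (c2 < 1.0) and (c1 > c2 + 1.0)


def is_undetected(flux, flux_err, max_snr=2.0):
    """Return True if the band has S/N at or below the veto threshold (assumes flux_err > 0)."""
    return flux / flux_err <= max_snr


def select(m115, m150, m277, f814, e814, f115, e115):
    """Colour cut plus the 2-sigma non-detection requirement in F814W and F115W."""
    return (is_f115w_dropout(m115, m150, m277)
            and is_undetected(f814, e814)
            and is_undetected(f115, e115))


# Hypothetical source: red across the break, flat redward, undetected in both blue bands.
candidate = select(29.5, 27.5, 27.3, f814=0.5, e814=1.0, f115=1.0, e115=1.0)
```

A source detected at S/N > 2 in either F814W or F115W fails `select` even when its colours pass, which mirrors the reduction from 3 739 to 875 objects.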

3.1.1 Comparison with JADES

To assess the reliability of the colour-selected sample, we compare it with spectroscopically confirmed galaxies. We first collected all sources that have both NIRCam and NIRSpec observations in the JADES field. From JADES DR3 (D’Eugenio et al. 2025), we then selected only galaxies whose spectra have quality ranks A, B, or C, which have secure and visually identified emission lines. The flux uncertainty of this photometric dataset is then degraded to the typical noise levels expected in COSMOS-Web. In Figure 2, we show the spectrum of GN-z11 together with the depth of COSMOS-Web at each wavelength, indicating that the depth is sufficient to detect bright galaxies up to $z\sim 11$. However, we found that no galaxies in the JADES DR3 catalogue satisfy both the COSMOS-Web colour criteria and the signal-to-noise thresholds while also being brighter than the COSMOS-Web limiting magnitudes in the F150W, F277W, and F444W bands. We attribute the lack of bright sources to the small field of view of JADES. Therefore, to further test the reliability of the colour-selected sample, we relaxed the signal-to-noise constraints and applied only the colour criteria.

Figure 2. Spectral energy distribution (SED) of the spectroscopically confirmed galaxy GN-z11 at redshift $z=10.60$ (black line), overlaid with the 5 $\sigma$ detection limits of various filters used in this study. The coloured upward arrows indicate 5 $\sigma$ limiting depths in each band: F814W (purple), F115W (light blue), F150W (green), F277W (orange), and F444W (red). For each band, the fainter (transparent) arrows denote the shallower 5 $\sigma$ depths reached in approximately 50% of the survey area, due to non-uniform coverage and exposure time. Fluxes are shown in nJy on the left y-axis, with the corresponding AB magnitudes on the right y-axis. Horizontal error bars represent the approximate width of each filter’s transmission curve. This figure illustrates the ability of the JWST NIRCam bands to probe the rest-frame UV-to-optical emission of galaxies at $z\gt10$ .

Figure 3 shows that the contamination rate of the colour selection, i.e. the fraction of colour-selected $8 \lt z \lt 12$ candidates that are in fact lower-redshift galaxies, is 25%. The loss rate, i.e. the fraction of spectroscopically confirmed $8 \lt z \lt 12$ galaxies that do not satisfy the colour criteria, is $\sim$ 42%.

Figure 3. The two-colour diagram for sources from JADES DR3 matching the COSMOS-Web limiting magnitudes. Scattered grey dots represent all objects that have both NIRCam and NIRSpec data. The green polygon delineates our F115W-dropout selection criteria. Coloured stars are spectroscopically confirmed galaxies with $8.0\leq z_{\text{spec}}\leq12.0$; the over-plotted numbers mark their spectroscopic redshifts. Red/blue colours mark galaxies that satisfy/fail the colour criteria. Black triangles indicate sources with $z_{\text{spec}}\lt8.0$ that satisfy the colour criteria. The text annotations quote a contamination rate of 25% (the ratio of the number of galaxies with $z_{\text{spec}}\lt8.0$ that satisfy the colour criteria ($N=5$) to the number of all sources that satisfy the colour criteria ($N=20$)) and a loss rate of 42% (the ratio of the number of $8.0\leq z_{\text{spec}}\leq12.0$ galaxies that do not satisfy the colour criteria ($N=11$) to the number of all $8.0\leq z_{\text{spec}}\leq12.0$ galaxies ($N=26$)).
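As a quick arithmetic check of the two rates quoted in Figure 3, the counts from the caption can be plugged in directly. The function names below are hypothetical; only the four counts come from the text.

```python
def contamination_rate(n_lowz_selected, n_selected):
    """Fraction of colour-selected sources that are actually at z_spec < 8."""
    return n_lowz_selected / n_selected


def loss_rate(n_highz_missed, n_highz_total):
    """Fraction of 8 <= z_spec <= 12 galaxies that fail the colour criteria."""
    return n_highz_missed / n_highz_total


contamination = contamination_rate(5, 20)  # counts quoted in the Figure 3 caption
loss = loss_rate(11, 26)

# Both ratios imply the same number of high-z galaxies passing the colour cut.
n_highz_selected = 20 - 5
assert n_highz_selected == 26 - 11
```

The two definitions use different denominators (all colour-selected sources versus all spectroscopic high-z galaxies), which is why both can be derived from the same 15 correctly selected galaxies.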

In addition, at redshifts $z\geq11$, the effect of the intergalactic medium (IGM) on the observed F150W fluxes becomes increasingly severe. In particular, the damping wing of Lyman-$\alpha$ can extend well into the F150W filter, significantly suppressing the observed continuum flux beyond the nominal wavelength of the Lyman-$\alpha$ break. This results in diminished detectability for sources at $z\geq11$. Consequently, our colour selection technique becomes less efficient at the highest redshifts, as illustrated by the $z\geq11$ galaxy in Figure 3, whose F150W-F277W colour lies far beyond the selection boundary. Therefore, it is likely that some extremely high-redshift galaxies are missed due to IGM attenuation.

3.2 SED fitting and photometric redshifts

Here we perform the SED analysis mainly with CIGALE (Boquien et al. 2019) v2022.1, and use EAZY (Brammer, van Dokkum, & Coppi 2008) as a cross-check on the derived redshifts and galaxy properties. We integrate all the available photometric data from JWST and HST in the COSMOS-Web field. This includes F814W, F115W, F150W, F277W, and F444W. Following the analysis in Section 3.1.1, we further compare the photometric redshift performance of these two codes using galaxies with spectroscopic redshifts in the JADES DR3 catalogue. We only used filters available in COSMOS-Web and degraded the JADES photometry to COSMOS-Web depths, as described in Sections 3.2.1 and 3.2.2.

3.2.1 SED with CIGALE

CIGALE provides comprehensive modelling that allows us to evaluate more carefully how different physical assumptions (nebular emission, dust, etc.) might affect the inferred high-redshift solutions. We adopt the dustatt_modified_CF00 module for dust attenuation, a single stellar population model (SSP, Bruzual & Charlot 2003), and the sfhdelayed module for the star formation history. In the CIGALE configuration, the photo-z is fitted from 0 to 12 in increments of 0.06.

Strong emission-line galaxies at $z\lt6$ are concerning interlopers that can masquerade as $z\sim16$ Lyman-break galaxies (Arrabal Haro et al. 2023), because H$\beta$+[OIII] can contribute significantly to the F277W flux, and H$\alpha$ to the F444W flux, for emission-line galaxies at $z\sim4-5$. Therefore, including nebular emission in the CIGALE modelling should give a better chance of identifying such interlopers.

To further shorten the time needed to compute the best-fit model in CIGALE, we did not include AGN models or X-ray and radio parameters.

CIGALE reports photometric redshifts in two forms: the best-fit value and the Bayesian estimate. The relative deviation between best-z and Bayesian-z for all of the remaining galaxies is shown in Figure 4. Since CIGALE does not calculate the error of the best-z, we adopt the 16th and 84th percentiles of the PDF as its lower and upper errors.

Figure 4. The relative percentage deviation of the photo-z derived from CIGALE between the best-fit model and the Bayesian estimation. The colourmap indicates the number of sources in each hexagonal bin. Red points show the median in each redshift bin and the corresponding standard deviation. For most sources, the Bayesian estimation does not deviate by more than $7.5\%$ from the best-fit model across 8 $\leq z \leq$ 12.

3.2.2 SED with EAZY

EAZY combines user-supplied templates to fit the observed photometry of a given galaxy. The template set we used in EAZY includes the tweak_fsps_QSF_12_v3 set of 12 flexible stellar population synthesis (FSPS; Conroy & Gunn 2010) templates recommended by the EAZY documentation. We also included the nine templates (Set1+3+4) designed for high-redshift galaxies from Larson et al. (2022), and the seven JADES SED templates used by Hainline et al. (2024).

To prevent the F814W flux upper limit from overly constraining the fits, and to account for any photometric calibration uncertainties introduced during reprojection, we set EAZY to add a photometric uncertainty of 10%. We did not adopt any apparent magnitude prior in the EAZY parameters, to avoid dropping faint sources.

Figure 5. Comparison between photometric redshifts from CIGALE/EAZY and spectroscopic redshifts from JADES DR3. Grey points represent all galaxies with both photometry and spectroscopic redshifts in the JADES field. Note that we did not use filters that are unavailable in COSMOS-Web, and the JADES photometry is degraded to COSMOS-Web quality. Green stars denote JADES galaxies that satisfy our F115W-dropout colour criteria. The solid black line indicates the one-to-one correspondence ($z_{\mathrm{spec}} = z_{\mathrm{phot}}$), and the red points highlight sources within the dashed lines of 10% deviation. For the high-redshift subset ($z_{\mathrm{spec}} \gt 8$), we find a significantly improved normalised median absolute deviation (NMAD) and outlier fraction. This figure demonstrates the performance of photometric redshift estimation under COSMOS-Web-like conditions.

The statistics reported in Figure 5 are the normalised median absolute deviation (NMAD), $\sigma_{\text{NMAD}}$, and the 10% outlier rate, $\eta$:

(2)

\begin{equation} \sigma_{\text{NMAD}} = 1.48 \times \text{median}(|\Delta z - \text{median}(\Delta z)|) \end{equation}

(3)

\begin{equation} \eta = \frac{N_{|\Delta z|\gt0.1z}}{N_{\text{tot}}} \end{equation}

As Figure 5 shows, both codes yield similar results in terms of $\eta$ and $\sigma_{\text{NMAD}}$, which equal 0.50 (0.58) and 0.15 (0.10), respectively, for CIGALE (EAZY). Because CIGALE also fits the physical parameters simultaneously, we use CIGALE throughout the manuscript.
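A minimal sketch of the two statistics in Eqs. (2) and (3) is given below. It assumes that $\Delta z = z_{\text{phot}} - z_{\text{spec}}$ and that the $z$ in the outlier threshold $0.1z$ is the spectroscopic redshift; neither convention is spelled out in the text, and the function names are hypothetical.

```python
from statistics import median


def nmad(z_phot, z_spec):
    """sigma_NMAD = 1.48 * median(|dz - median(dz)|), following Eq. (2)."""
    dz = [p - s for p, s in zip(z_phot, z_spec)]
    med = median(dz)
    return 1.48 * median([abs(d - med) for d in dz])


def outlier_fraction(z_phot, z_spec, frac=0.1):
    """eta = N(|dz| > frac * z_spec) / N_tot, following Eq. (3)."""
    n_out = sum(1 for p, s in zip(z_phot, z_spec) if abs(p - s) > frac * s)
    return n_out / len(z_spec)


# Toy sample (made-up values): three good redshifts and one catastrophic failure.
z_spec = [8.0, 9.0, 10.0, 11.0]
z_phot = [8.1, 9.0, 9.9, 2.0]
sigma_nmad = nmad(z_phot, z_spec)
eta = outlier_fraction(z_phot, z_spec)
```

Because the NMAD is built from medians, the single catastrophic failure in the toy sample barely affects it, while it is fully counted in the outlier fraction.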

To further demonstrate the robustness of the photometric redshifts from CIGALE, we followed Harikane et al. (2023), Finkelstein et al. (2023), and Hainline et al. (2024) in requiring that the high-z solution be strongly preferred over the low-z solution, with $\Delta\chi^2 = \chi^{2}_{(z\lt8)} - \chi^{2}_{(z\gt8)} \geq 9$. Figure 6 shows the SED result from CIGALE with the corresponding $\chi^2$ distribution for passed/failed galaxies.

Figure 6. CIGALE best-fit spectral energy distribution and redshift likelihood for galaxy. Left: The black curve shows the total model SED corresponding to the minimum- $\chi^2$ $z \gt 8$ solution. Coloured components indicate the attenuated stellar continuum (yellow), grey intrinsic stellar emission (blue dashed), nebular lines and continuum (green), and thermal dust emission (red). Magenta circles mark the observed NIRCam/IRAC fluxes, while the green triangle denotes a 2-sigma upper limit. We added the best $z_{\mathrm{phot}} \lt 8$ solution as a grey line for reference. Top-right: variation of $\chi^2$ with redshift. The sharp minimum at $z\approx9.3$ and the absence of significant secondary solutions ( $\Delta\chi^2 \leq 9$ ) at lower redshift $(z \lt 8)$ confirm the robustness of the high-z interpretation. Bottom-right: 2" $\times$ 2" cutouts in HST/ACS F814W and JWST/NIRCam F115W, F150W, F277W, and F444W (left to right). Red tick marks (0.5" in length) identify the target. The non-detections in F814W and F115W, coupled with clear detections long-ward of 1.5 $\unicode{x03BC}$ m, are consistent with a Lyman-break galaxy at $z\approx9.3$ .

The $\chi^{2}$ distribution is converted from the PDF of redshift from CIGALE with the same equation implemented in EAZY (Eq. 4).

(4)

\begin{equation} P(z) = e^{-(\chi^{2} - \chi^{2}_{\min})/2} \end{equation}
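The relation between $\chi^2(z)$ and $P(z)$, together with the $\Delta\chi^2 \geq 9$ requirement, can be illustrated as follows. This sketch uses the standard likelihood form $P \propto \exp[-(\chi^2-\chi^2_{\min})/2]$ with a trapezoidal normalisation; the helper names and the toy $\chi^2$ values are hypothetical.

```python
import math


def trapz(y, x):
    """Trapezoidal integral of samples y over grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def chi2_to_pdf(z_grid, chi2):
    """Convert chi^2(z) to a normalised P(z) via exp(-(chi2 - chi2_min) / 2)."""
    chi2_min = min(chi2)
    like = [math.exp(-(c - chi2_min) / 2.0) for c in chi2]
    norm = trapz(like, z_grid)
    return [v / norm for v in like]


def delta_chi2(z_grid, chi2, z_split=8.0):
    """Best chi^2 below z_split minus best chi^2 above it."""
    low = min(c for z, c in zip(z_grid, chi2) if z < z_split)
    high = min(c for z, c in zip(z_grid, chi2) if z > z_split)
    return low - high


# Toy chi^2 curve (made-up values) with a shallow low-z and a deep high-z minimum.
z_grid = [2.0, 4.0, 9.0, 10.0]
chi2 = [30.0, 15.0, 4.0, 8.0]
pdf = chi2_to_pdf(z_grid, chi2)
passes = delta_chi2(z_grid, chi2) >= 9.0
```

With the negative exponent, the redshift with the smallest $\chi^2$ receives the largest probability, which is what the selection on the best photo-z relies on.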

We also selected sources that have the best photo-z solution occurring at $z\gt8$ . In total, this leaves us with a final sample of 366 sources.

3.3 Clustering of the galaxies

3.3.1 Weighted number density

Since the colours of Balmer-break galaxies at lower redshift ($z\sim2\!-\!3$) and Lyman-break galaxies (at $z\sim9$) are degenerate, our galaxy sample could easily be contaminated by low-z interlopers. To mitigate this, we use the Bayesian analysis in CIGALE to retrieve the redshift PDF and use it to weight the number density of galaxies at higher redshift.

Of the 366 F115W dropouts, only 10 galaxies have $z_{\text{best}}\geq 11$ and 13 galaxies have $z_{\text{best}}\leq 9$, which is statistically insufficient to evaluate the number density or overdensity at $z_{\text{best}}\leq 9$ and $z_{\text{best}}\geq 11$. As a result, we restrict the clustering analysis to the 343 galaxies located between $9.0\leq z_{\text{best}} \leq11$.

Since the median photometric redshift uncertainty is non-negligible ($\pm\Delta z \sim$ 0.5), we measure the local number density centred on each galaxy by integrating the PDFs of every galaxy within an aperture of a given radius and weighting them accordingly (Eq. 5). The reasoning behind the chosen radius is given in the next section (Section 3.3.2). The redshift interval in the equation is centred on the best-fit redshift of the galaxy whose local number density we want to compute and spans $\pm\Delta z=0.5$; combined with the aperture, this makes the local volume a cylinder. The length of the cylinder ($\pm\Delta z\sim$ 0.5) is equivalent to a distance of $\sim 250$ cMpc at $z=9.0$.

(5)

\begin{equation}W = \int_{z_{1}}^{z_{2}} P(z) \,dz\end{equation}
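The weight in Eq. (5) is simply the fraction of a galaxy's redshift PDF that falls inside the slice. The sketch below assumes the PDF is tabulated on an ascending redshift grid and is already normalised; the linear interpolation at the slice edges and all function names are illustrative choices, not taken from the paper.

```python
import math


def interp(x, xs, ys):
    """Linear interpolation of (xs, ys) at x; xs must be ascending and contain x."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside grid")


def pdf_weight(z_grid, pdf, z_lo, z_hi):
    """W = integral of P(z) from z_lo to z_hi (trapezoidal rule, clipped to the grid)."""
    z_lo, z_hi = max(z_lo, z_grid[0]), min(z_hi, z_grid[-1])
    zs = [z_lo] + [z for z in z_grid if z_lo < z < z_hi] + [z_hi]
    ps = [interp(z, z_grid, pdf) for z in zs]
    return sum(0.5 * (ps[i] + ps[i + 1]) * (zs[i + 1] - zs[i]) for i in range(len(zs) - 1))


# Toy Gaussian PDF (made-up): centred at z = 9.3 with sigma = 0.5, on a 0..12 grid in steps of 0.06.
z_grid = [0.06 * i for i in range(201)]
pdf = [math.exp(-0.5 * ((z - 9.3) / 0.5) ** 2) / (0.5 * math.sqrt(2 * math.pi)) for z in z_grid]
w = pdf_weight(z_grid, pdf, 9.3 - 0.5, 9.3 + 0.5)
```

For this toy Gaussian the slice spans $\pm1\sigma$, so a neighbour with such a PDF would contribute roughly two thirds of a galaxy to the weighted count.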

At the edge of the survey field, we also scale the number density by the fraction of the volume of the cylinder within the survey field. To avoid overestimating the number density at the edge of the survey area, we discarded the measurements that had a scale factor greater than 2.
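The $\sim 250$ cMpc depth quoted for the $\pm\Delta z = 0.5$ cylinder at $z = 9.0$ can be checked by integrating $c/H(z)$ over the slice. The sketch below assumes a flat $\Lambda$CDM model with the Planck15 parameters adopted in this work and neglects radiation; the function name and the midpoint integration are illustrative choices.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
H0 = 67.7            # km/s/Mpc, from h = 0.677
OMEGA_M = 0.307
OMEGA_L = 0.693


def comoving_depth(z_lo, z_hi, n_steps=1000):
    """Line-of-sight comoving distance between z_lo and z_hi in Mpc (flat LCDM, no radiation)."""
    dz = (z_hi - z_lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        z = z_lo + (i + 0.5) * dz  # midpoint rule
        total += dz / math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)
    return C_KM_S / H0 * total


depth = comoving_depth(8.5, 9.5)
```

The result is far larger than any of the aperture radii considered, so the cylinder is strongly elongated along the line of sight, as expected for a photometric-redshift slice.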

The overdensity ($\delta$) is calculated using Eq. (6), where $\rho(x)$ is the weighted number of galaxies in the given aperture. To estimate the uncertainty of the weighted galaxy number counts within each aperture, we performed a Monte Carlo sampling by placing 1 000 random apertures across the field and computing the weighted counts in each. The distribution of counts from these random apertures reflects the expected background fluctuations due to shot noise. The 16th and 84th percentiles of this non-Gaussian distribution are adopted as the lower and upper bounds of the 68% confidence interval. If the measured count within the target aperture lies outside this interval, we adopt a one-sided error bar by setting the error in the direction beyond the sampled distribution to zero, thereby avoiding unphysical or misleading uncertainties.

(6)

\begin{equation}\delta = \frac{\rho(x)-\bar{\rho}}{\bar{\rho}} = \frac{\rho(x)}{\bar{\rho}} - 1\end{equation}

where $\bar{\rho}$ represents the average number density measured from 1 000 random positions across the field at the same redshift as the central galaxy.
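A rough sketch of Eq. (6) and the random-aperture sampling is given below. It is a simplification under stated assumptions: positions are treated as flat-sky coordinates in cMpc, the field is a rectangle, each galaxy's weight is the precomputed $W$ of Eq. (5), and the edge correction is omitted. All names are hypothetical.

```python
import random


def overdensity(rho, rho_mean):
    """delta = rho / rho_mean - 1, following Eq. (6)."""
    return rho / rho_mean - 1.0


def percentile(values, q):
    """Linearly interpolated percentile of values, with q in [0, 100]."""
    s = sorted(values)
    k = (len(s) - 1) * q / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (k - lo) * (s[hi] - s[lo])


def random_aperture_counts(xy, weights, radius, bounds, n_random=1000, seed=0):
    """Weighted counts inside n_random circular apertures placed uniformly in the field."""
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    counts = []
    for _ in range(n_random):
        cx, cy = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        counts.append(sum(w for (x, y), w in zip(xy, weights)
                          if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2))
    return counts


# Degenerate toy field: one galaxy and an aperture larger than the field.
counts = random_aperture_counts([(0.5, 0.5)], [0.7], radius=10.0,
                                bounds=((0.0, 1.0), (0.0, 1.0)))
rho_mean = sum(counts) / len(counts)
lower, upper = percentile(counts, 16), percentile(counts, 84)
```

In a real field the spread between `lower` and `upper` would give the 68% interval against which the count in a target aperture is compared.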

3.3.2 Significance of clustering

In previous protocluster searches, many candidates were identified from spectroscopically selected galaxies with small physical separations ($\lt3$ cMpc). However, according to Millennium-based simulations, protoclusters have different physical sizes (Yajima et al. 2022; Chiang et al. 2017), which are related to their final mass at $z = 0$ (Muldrew et al. 2015; Lovell, Thomas, & Wilkins 2018). Such small separations probe only the very centre, or core, of a protocluster at these redshifts.

This motivates an analysis of how the aperture radius affects the measurement over a range of possible protocluster scales, and the introduction of a significance ($\sigma$, Eq. 7) that quantifies how strongly each overdense region stands out above field fluctuations.

Therefore, we explore different aperture sizes ($R = 0.5-7.5$ cMpc) for a complete characterisation of protocluster environments at such high redshift. We adopt Eq. 15 from Li et al. (2025) as the definition of the significance of a given overdensity value. It is defined as the difference between the overdensity and its average ($\mu_{\delta}$), divided by the standard deviation ($\sigma_{\delta}$) of the overdensity distribution measured from the same Monte Carlo sampling described in Section 3.3.1 at the same redshift as the central galaxy.

(7)

\begin{equation}\sigma = \frac{\delta - \mu_{\delta}}{\sigma_{\delta}}\end{equation}

Because each galaxy contributes a continuous membership probability, the count distribution follows a Poisson-binomial law. We therefore estimate uncertainties from the distribution of counts in those random apertures and quote the difference between the 84th percentile and the median as symmetric errors.
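As a hedged illustration of Eq. (7) and of the Poisson-binomial statement, the sketch below computes the significance from a set of random-aperture overdensities and the first two moments of a count in which galaxy $i$ is a member with probability $p_i$. The use of the sample standard deviation (`ddof=1`) and the assumption of independent memberships are choices made for this example only; the function names are hypothetical.

```python
import numpy as np

def significance(delta, random_deltas):
    """Eq. (7): (delta - mu_delta) / sigma_delta, with mu and sigma taken
    from the overdensities measured in the random apertures."""
    random_deltas = np.asarray(random_deltas, dtype=float)
    mu = random_deltas.mean()
    sd = random_deltas.std(ddof=1)  # sample standard deviation (an assumption)
    return (delta - mu) / sd

def poisson_binomial_moments(p):
    """Mean and variance of a count where galaxy i is included with
    independent probability p[i]: mean = sum(p), var = sum(p * (1 - p))."""
    p = np.asarray(p, dtype=float)
    return float(p.sum()), float((p * (1.0 - p)).sum())

# Toy inputs (hypothetical values).
random_deltas = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
print(significance(2.5, random_deltas))
print(poisson_binomial_moments([0.9, 0.6, 0.3]))
```

The variance of a Poisson-binomial count is never larger than that of a Poisson count with the same mean, which is one reason an empirical distribution from random apertures is a safer error estimate than a $\sqrt{N}$ rule.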

4. Results and discussion

4.1 The impact of aperture size on overdensity measurements

To quantitatively assess how aperture size influences overdensity measurements, we calculated $\delta$ for each galaxy using five representative radii: $R =0.5, 1.0, 2.5, 5.0$ , and 7.5 cMpc. These choices span the observed range of protocluster core sizes, from the very compact A2744z8OD ( $\sim$ 0.3 cMpc, $\delta\approx130$ ; Ishigaki, Ouchi, & Harikane 2016) to more extended structures with physical sizes of 2–8 cMpc at $z\sim8$ (Trenti et al. 2012; Helton et al. 2024b; Li et al. 2025), as well as the typical half-mass radii in simulations ( $\lesssim 10$ cMpc; Chiang, Overzier, & Gebhardt 2013). The resulting overdensity distributions for each aperture size are shown in Figure 7.

Figure 7. Histograms of galaxy overdensity, $\delta$ , measured for all galaxies in the field at different aperture radii, $R = 0.5$ , 1.0, 2.5, 5.0, and 7.5 cMpc (from top to bottom, left to right), for two redshift intervals: (Left) $9 \lt z \lt 10$ , (Right) $10 \lt z \lt 11$ . Each panel shows the distribution of $\delta$ values, with the blue bars representing the number of galaxies. The shaded region indicates the 1 $\sigma$ range around the mean. The red dashed line marks the median, while the green dashed line shows the mean overdensity. The legend in each panel indicates the aperture size and the corresponding statistical values. Note that the overdensity distribution becomes narrower and shifts toward lower values as the aperture radius increases, reflecting the dilution of local enhancements over larger spatial scales.

As seen in Figure 7, the overdensity distributions for smaller apertures ( $R = 0.5$ and 1.0 cMpc) are strongly skewed toward high $\delta$ values, with medians of $\delta \approx$ 97.22 and 23.64, respectively. This is primarily due to the small sampling volumes, which result in extreme $\delta$ values ( $\delta \gtrsim$ 20). In contrast, when using larger apertures ( $R = 2.5$ , 5.0, and 7.5 cMpc), the median $\delta$ values decrease towards zero ( $\delta \approx$ 3.31, 1.10, and 0.70). At the largest radius, 7.5 cMpc, all densities are diluted to less than four times the field average ( $\delta \lt 3$ ).

Among these, we found that a core aperture radius of 2.5 cMpc offers a balanced sensitivity: it is large enough to encompass significant physical structure, while still focusing on the densest core regions where the protocluster signature is most robust. Smaller apertures tend to fragment physically connected structures, while larger apertures dilute the overdensity signal by including more of the surrounding field. Therefore, we adopt $R=2.5$ cMpc as the fiducial radius for identifying protocluster cores in our dataset and for calculating the local overdensity of each galaxy.

4.2 Protocluster candidates

First, we identify the ‘core’ galaxies of protocluster candidates as those with overdensity ( $\delta$ ) values exceeding the 95th percentile of the $\delta$ distribution for all galaxies and with statistically significant peaks based on Monte Carlo sampling ( $\sigma \gt 3$ ). To assemble full protocluster candidates, we link galaxies located within a projected distance of 7.5 cMpc from these cores. Chiang et al. (2017) demonstrated that the probability of a galaxy being associated with the protocluster core drops sharply beyond this scale at $z \sim 7$ , which is why we use this linking length.

Additionally, if two core galaxies are within 7.5 cMpc of each other, we merge them into a single larger structure. Because a larger radius (7.5 cMpc) is used to search for members, we can still detect large structures such as those probed with $R = 5.0$ or 7.5 cMpc apertures. The reported core candidates and members are identified at different scales to maximise our sensitivity to a range of protocluster morphologies and to ensure comparability with previous studies using various methodologies.
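The core selection and linking steps described above can be sketched as follows. This is a minimal sketch under assumptions that are not stated in the text: cores are merged transitively (a chain of cores each within the linking length forms one structure), a non-core galaxy is attached to its nearest core, and positions are flat-sky comoving coordinates in cMpc. The function names are hypothetical.

```python
import numpy as np

def select_cores(delta, sigma, percentile=95.0, sigma_min=3.0):
    """Boolean mask: delta above the given percentile AND sigma above sigma_min."""
    delta = np.asarray(delta, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return (delta > np.percentile(delta, percentile)) & (sigma > sigma_min)

def group_candidates(xy, is_core, link=7.5):
    """Label each galaxy with a candidate index (-1 for field galaxies).
    Cores within `link` of each other are merged (transitively, by assumption);
    other galaxies join the nearest core if it lies within `link`."""
    xy = np.asarray(xy, dtype=float)
    is_core = np.asarray(is_core, dtype=bool)
    core_idx = [int(i) for i in np.flatnonzero(is_core)]
    labels = np.full(len(xy), -1)
    if not core_idx:
        return labels

    # Union-find over the core galaxies.
    parent = {i: i for i in core_idx}
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for pos, a in enumerate(core_idx):
        for b in core_idx[pos + 1:]:
            if np.hypot(*(xy[a] - xy[b])) <= link:
                parent[find(a)] = find(b)

    roots = sorted({find(i) for i in core_idx})
    root_label = {r: k for k, r in enumerate(roots)}
    for i in core_idx:
        labels[i] = root_label[find(i)]

    # Attach non-core galaxies to the nearest core within the linking length.
    for j in np.flatnonzero(~is_core):
        d = np.hypot(xy[core_idx, 0] - xy[j, 0], xy[core_idx, 1] - xy[j, 1])
        k = int(np.argmin(d))
        if d[k] <= link:
            labels[j] = labels[core_idx[k]]
    return labels

# Toy example: two nearby cores, one isolated core, one member, one field galaxy.
xy = [[0.0, 0.0], [5.0, 0.0], [30.0, 0.0], [3.0, 1.0], [50.0, 50.0]]
labels = group_candidates(xy, [True, True, True, False, False])
print(labels)
```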

Figure 8 shows the most significant photometrically selected protocluster candidates in the COSMOS-Web DR0.5 field of view. For illustrative purposes, the overdensity colourmap is smoothed with a Gaussian kernel density estimation (KDE), with the $1\sigma$ width of the Gaussian kernel set to 2.5 cMpc. The top panel shows the five most promising photometrically selected protocluster candidates at $9 \leq z \leq 10$ , labelled COSMOSWeb_PC-1, COSMOSWeb_PC-2, COSMOSWeb_PC-3, COSMOSWeb_PC-4, and COSMOSWeb_PC-5. The bottom panel shows the two significant detections of protocluster candidates, COSMOSWeb_PC-6 and COSMOSWeb_PC-7, at $10 \leq z \leq 11$ .
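A Gaussian-kernel surface-density map of the kind used for illustration here can be sketched in a few lines. This is an assumption-laden example rather than the plotting code behind Figure 8: it uses flat-sky coordinates in cMpc, a regular grid, and a normalised two-dimensional Gaussian with $1\sigma = 2.5$ cMpc; the function name and inputs are hypothetical.

```python
import numpy as np

def gaussian_density_map(xy, weights, grid_x, grid_y, sigma=2.5):
    """Weighted surface density (per cMpc^2) on a regular grid, obtained by
    summing one normalised 2D Gaussian of width `sigma` per galaxy."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    density = np.zeros_like(gx, dtype=float)
    norm = 1.0 / (2.0 * np.pi * sigma ** 2)
    for (x, y), w in zip(xy, weights):
        r2 = (gx - x) ** 2 + (gy - y) ** 2
        density += w * norm * np.exp(-r2 / (2.0 * sigma ** 2))
    return density

# Toy example: a single galaxy of weight 1 at the origin.
grid = np.arange(-20.0, 20.5, 0.5)  # cMpc, 0.5 cMpc pixels
dens = gaussian_density_map(np.array([[0.0, 0.0]]), np.array([1.0]), grid, grid)
print(dens.sum() * 0.5 * 0.5)  # integral of the map, close to the total weight
```

Because each kernel is normalised, the integral of the map recovers the summed weights, so the colour scale can be read as a smoothed surface density.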

Figure 8. Projected distribution of F115W dropout galaxies across the COSMOS-Web footprint. Black circles mark all sources whose best-fit photometric redshifts satisfy the range $9\le z_{\mathrm{best}}\le10$ (or $10\le z_{\mathrm{best}}\le11$ in the bottom panel). The size of each scatter point represents the significance ( $\sigma$ ) of its overdensity value compared to the Monte Carlo sampling. A Gaussian-kernel surface-density map (shown in green-to-blue shading) highlights significant overdensities. Within each overdense peak, galaxies with overdensities greater than the 95th percentile of the entire distribution ( $\delta\ge\delta_{95}$ ) and robust enough ( $\sigma\gt3$ ) over the Monte Carlo sampling are designated protocluster core candidates (red stars), while blue stars indicate galaxies lying inside an $R=7.5$ cMpc aperture centred on the cores. The red polygon outlines the NIRCam field of view, and the grey background denotes regions outside the imaging coverage.

Figure 9. Left: JWST/NIRCam three-colour mosaic (B: F814W+F115W, G: F150W+F277W, R: F444W) centred on the COSMOS-Web protocluster candidate COSMOS-Web_PC-1 ( $z_{\mathrm{phot}}\simeq 9.3$ ). Red stars mark the positions of member/core galaxies. Solid frames identify candidate members that have higher probabilities ( $W\gt0.5$ ) of being associated with the cores, while dashed frames mark lower-probability ( $0.025\lt W \leq0.5$ ) sources. Each cutout has a field of view of 4" $\times$ 4". The white bar in the lower-left corner corresponds to 54" (2.5 cMpc) at $z\simeq 9.5$ . North is up, and east is to the left. Right: The PDFs of cores/potential members for the protocluster candidate COSMOS-Web_PC-1. Solid/dashed lines denote the solid/dashed frame sources in the left panel.

Figure 10. Same as Figure 9, but for COSMOSWeb_PC-2.

Figure 11. Same as Figure 9, but for COSMOSWeb_PC-3.

The detailed false-colour cutouts and the probability density functions (PDFs) for the members of each protocluster are shown in the left and right panels of Figures 9–15, respectively. Solid PDFs mark galaxies that are more likely to be associated with the central galaxy $(p \gt 50\%)$ , whereas dashed PDFs mark galaxies with a lower probability $(2.5\% \lt p \lt 50\%)$ of being associated with the core galaxies (corresponding to $\lt2\sigma$ if assuming a Gaussian distribution). The spatial clustering and photometric redshift coherence suggest that these highlighted groups are promising protoclusters at cosmic dawn. We summarise the position, overdensity, significance, photometry, and stellar mass of each member of the reported protocluster candidates in Table 1.

Figure 12. Same as Figure 9, but for COSMOSWeb_PC-4.

Figure 13. Same as Figure 9, but for COSMOSWeb_PC-5.

Figure 14. Same as Figure 9, but for COSMOSWeb_PC-6.

Figure 15. Same as Figure 9, but for COSMOSWeb_PC-7.

4.2.1 Physical size of protocluster candidates

The physical size of a forming galaxy cluster (protocluster), i.e. the region expected to contain at least 50% of the descendant mass according to the half-mass radius statistics of cosmological simulations (Chiang, Overzier, & Gebhardt 2013), is suggested to be less than 10 cMpc. However, recent searches for protoclusters are mainly limited by survey area, so structures with larger half-mass radii of 7–10 cMpc may be incompletely sampled. The overdensity reported by Helton et al. (2024b) was obtained within a volume of 15 cMpc $^3$ , with a NIRCam Wide Field Slitless Spectroscopy (WFSS) survey that covered only $\approx 62$ arcmin $^2$ . Two overdensities containing four and six galaxies were found at $z \sim 8$ , with physical sizes of 3.9 and 5.8 cMpc, respectively. Trenti et al. (2012) also identified an overdensity ( $\sigma \approx$ 8.9, relative to the field mean) at $z \approx 8$ in the BoRG survey, using a search box of $2.83 \times 2.83$ cMpc $^2$ (corresponding to 62" $\times$ 62" at $z = 8.0$ ). By contrast, our search method benefits from a wider survey area. Using an aperture radius of 7.5 cMpc, we can cover larger structures with sizes of $\sim15$ cMpc.

We estimate the size of each protocluster candidate from the members that are most widely separated in right ascension and declination, with the depth defined as the maximum difference in best-fit redshifts among highly associated members.

COSMOSWeb_PC-1 ( $\delta_\textrm{max} = 9.04$ , $\sigma_\textrm{max} = 4.27$ , Figure 9) comprises five galaxies at $z \sim 9.3$ . The overdensity extends across a volume of approximately $2.8 \times 7.4 \times 243.7$ cMpc $^3$ .

COSMOSWeb_PC-2 ( $\delta_\textrm{max} = 9.13$ , $\sigma_\textrm{max} = 3.65$ , Figure 10) contains eight galaxies at $z \sim 9.9$ . We estimate the total volume of this protocluster to be $5.5 \times 12.0 \times 226.8$ cMpc $^3$ .

COSMOSWeb_PC-3 ( $\delta_\textrm{max} = 8.73$ , $\sigma_\textrm{max} = 4.20$ , Figure 11) contains ten galaxies at $z \sim 9.2$ . We estimate the total volume of this protocluster to be $11.0 \times 9.1 \times 241.9$ cMpc $^3$ .

COSMOSWeb_PC-4 ( $\delta_\textrm{max} = 8.12$ , $\sigma_\textrm{max} = 3.63$ , Figure 12) contains seven galaxies at $z \sim 9.2$ . We estimate the total volume of this protocluster to be $14.7 \times 6.4 \times 358.2$ cMpc $^3$ .

COSMOSWeb_PC-5 ( $\delta_\textrm{max} = 8.56$ , $\sigma_\textrm{max} = 4.41$ , Figure 13) contains nine galaxies at $z \sim 9.3$ . We estimate the total volume of this protocluster to be $9.3 \times 16.6 \times 225.2$ cMpc $^3$ .

COSMOSWeb_PC-6 ( $\delta_\textrm{max} = 10.96$ , $\sigma_\textrm{max} = 3.78$ , Figure 14) contains five galaxies at $z \sim 10.3$ . We estimate the total volume of this protocluster to be $8.2 \times 4.0 \times 206.2$ cMpc $^3$ .

COSMOSWeb_PC-7 ( $\delta_\textrm{max} = 8.80$ , $\sigma_\textrm{max} = 3.16$ , Figure 15) contains five galaxies at $z \sim 10.0$ . We estimate the total volume of this protocluster to be $13.1 \times 1.9 \times 211.7$ cMpc $^3$ .
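The conversion from sky separations and redshift differences to comoving extents can be illustrated with the sketch below. It is not the code used in this work: it assumes a flat $\Lambda$CDM cosmology with placeholder parameters ( $H_0 = 70$ km s $^{-1}$ Mpc $^{-1}$ , $\Omega_m = 0.3$ ), neglects radiation, and uses hypothetical function names and toy coordinates. With other cosmological parameters the numbers shift by a few per cent.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, n_steps=2000):
    """Line-of-sight comoving distance in flat LCDM via Simpson's rule.
    h0 and omega_m are placeholder assumptions; n_steps must be even."""
    omega_l = 1.0 - omega_m
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)
    h = z / n_steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    return (C_KM_S / h0) * total * h / 3.0

def comoving_to_arcsec(length_cmpc, z):
    """Angle subtended by a transverse comoving length (flat universe)."""
    return math.degrees(length_cmpc / comoving_distance_mpc(z)) * 3600.0

def candidate_extent(ra_deg, dec_deg, z):
    """(RA extent, Dec extent, line-of-sight depth) in cMpc from the most
    separated members in each coordinate and the largest redshift difference."""
    d_c = comoving_distance_mpc(0.5 * (min(z) + max(z)))
    dec_mid = math.radians(0.5 * (min(dec_deg) + max(dec_deg)))
    d_ra = math.radians(max(ra_deg) - min(ra_deg)) * math.cos(dec_mid) * d_c
    d_dec = math.radians(max(dec_deg) - min(dec_deg)) * d_c
    depth = comoving_distance_mpc(max(z)) - comoving_distance_mpc(min(z))
    return d_ra, d_dec, depth

# Toy members (hypothetical coordinates and photometric redshifts).
print(comoving_to_arcsec(2.5, 9.5))
print(candidate_extent([150.00, 150.05], [2.00, 2.03], [9.0, 10.0]))
```

The line-of-sight depth is far larger than the transverse extents because a photometric-redshift difference of order unity at $z \sim 9-10$ already corresponds to a few hundred cMpc.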

4.3 Stellar/Halo mass of candidates

It is important to note that the linking length used to associate protocluster members and the aperture used to define protocluster cores do not have to be the same. Therefore, when interpreting the data physically, such as when estimating the mass or rarity of protoclusters, it is important to use a radius comparable to the expected size of the protocluster at the relevant redshift ( $z \gt 9$ ). Here, we adopt a radius of 7.5 cMpc when summing up the stellar masses for members of these cores, as implied by Chiang et al. (2017).

We convert the stellar mass of each protocluster candidate to an estimated halo mass using the empirical relation from the semi-analytic simulation (Behroozi et al. 2019). However, this relation is only provided up to $z = 8$ , so we linearly extrapolate the evolution track from $z = 7$ to 8 and derive the stellar-to-halo mass ratios at $z \sim 9$ . Because these galaxies live in subhalos that are still merging into a larger, not fully virialised structure, we estimate the total protocluster halo mass by summing the stellar masses of all member galaxies (from SED fitting) and then applying a stellar-to-halo mass ratio calibrated at the group/cluster scale, rather than treating each galaxy’s halo individually. The extrapolated ratio lies between 0.004 and 0.012; this range is also included in the upper and lower errors in Figure 16.
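The arithmetic of this conversion can be sketched as follows. The example interprets the quoted ratio (0.004 to 0.012) as stellar mass divided by halo mass, which is an assumption made here because the value is below unity. The ratio values fed to the extrapolation and the member stellar masses are hypothetical and are not taken from Behroozi et al. (2019) or from Table 1.

```python
import math

def extrapolate_linear(z0, y0, z1, y1, z_target):
    """Extend the straight line through (z0, y0) and (z1, y1) to z_target."""
    slope = (y1 - y0) / (z1 - z0)
    return y1 + slope * (z_target - z1)

def halo_mass_range(stellar_masses, ratio_low=0.004, ratio_high=0.012):
    """Lower and upper total halo mass (solar masses) from the summed stellar
    mass, assuming ratio = M_star / M_halo."""
    m_star_total = sum(stellar_masses)
    return m_star_total / ratio_high, m_star_total / ratio_low

# Hypothetical ratios at z = 7 and z = 8, extrapolated to z = 9.
ratio_z9 = extrapolate_linear(7.0, 0.010, 8.0, 0.008, 9.0)

# Hypothetical member stellar masses in solar masses.
m_lo, m_hi = halo_mass_range([1.0e8, 2.0e8, 3.0e8])
print(ratio_z9, math.log10(m_lo), math.log10(m_hi))
```

Summing the stellar masses first and converting once treats the candidate as a single forming structure, which is the approach described in the text, rather than assigning each galaxy its own halo.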

Table 1. The basic information of the member galaxies in each protocluster candidate.

We also compare our halo masses, as a function of redshift, with those in the previous literature at $z \gt 5$ (Trenti et al. 2012; Chanchaiworawit et al. 2019; Calvi et al. 2021; Harikane et al. 2019; Laporte et al. 2022; Helton et al. 2024b). The halo mass inferred from Laporte et al. (2022), with (without) the magnification of the foreground galaxy cluster taken into account, is $M_{h}/M_{\odot} = 3.34^{+0.59}_{-0.50} \times 10^{11}$ ( $3.6^{+13.3}_{-2.8} \times 10^{11}$ ). Helton et al. (2024b) use the simulation results from UniverseMachine to derive the stellar-to-halo mass relation for their sample ( $11.5 \lt \log(M_{h}/M_{\odot}) \lt 13.4$ ). The halo masses of their most distant protoclusters ( $z_{\text{spec}} = 7.954$ and $8.222$ ) lie in the range $11.5 \lt \log(M_{h}/M_{\odot}) \lt 12.0$ , following a decreasing trend toward higher redshifts. The brightest galaxy in the $z = 8.5$ protocluster from the BoRG survey has a halo mass of $M_{h}/M_{\odot} = (4-7) \times 10^{11}$ (Trenti et al. 2012).

The total halo masses of our protocluster candidates, obtained by combining their member galaxies, span $10^{10.62}$ to $10^{11.39}\,{\rm M}_{\odot}$ and are shown in Figure 16. In this sense, our estimated halo masses are comparable to or exceed the inferred progenitor masses of present-day Coma-like clusters (i.e. $M_{h} \gt 10^{15}\,{\rm M}_{\odot}$ at $z = 0$ ), suggesting that some of these overdense regions may evolve into similarly massive systems, though a wide range of evolutionary outcomes remains possible.

5. Conclusions

1. We found seven protocluster candidates at $9 \leq z \leq 11$ , identified as galaxies within overdense regions with $\delta$ values exceeding the 95th percentile of the $\delta$ distribution for 366 F115W dropout galaxies. Each region has statistically significant peaks based on Monte Carlo sampling ( $\sigma \gt$ 3). Because of the large area coverage of COSMOS-Web, this work presents the largest number of protocluster candidates at $z\gtrsim 9-10$ to date.
2. The sizes of our protocluster candidates range over $3-4'$ ( $\sim$ 15 cMpc, at $z = 9.3$ ). These values are comparable to the potential physical sizes of protoclusters predicted by the simulations of Chiang et al. (2017) ( $\lesssim10$ cMpc). However, observations of protocluster candidates at scales of less than 10 cMpc, such as those of Helton et al. (2024b) at $z=8.22$ and Trenti et al. (2012) at $z=8.0$ , mainly target the central galaxies of the protoclusters at $z\geq8$ .
3. We estimate the halo masses of protocluster candidates at $z \sim 9-10$ by extrapolating the Behroozi et al. (2019) stellar-to-halo mass relation (assuming an extra 10% uncertainty), deriving total halo masses of $10^{10.62-11.39}\,{\rm M}_{\odot}$ for our candidates, including the 1 $\sigma$ error; this suggests that all the overdense regions may evolve into Coma-like clusters ( $M_{h} \gt 10^{15}\,{\rm M}_{\odot}$ at $z = 0$ ).

Figure 16. The halo mass of protoclusters as a function of redshift. The area between the grey lines shows the predicted evolution track of a massive cluster from the semi-analytic model by Chiang, Overzier, & Gebhardt (2013), whereas the dashed line is its linear extrapolation. The data points are the observed protoclusters from Trenti et al. (2012), Chanchaiworawit et al. (2019), Calvi et al. (2021), Harikane et al. (2019), Laporte et al. (2022), and Helton et al. (2024b). The red stars indicate the seven overdense regions of our work. The dashed line illustrates the typical threshold mass for a stable shock in a spherical infall. Below this threshold, the flows are predominantly cold, while above it, a shock-heated medium is present (Dekel & Birnboim 2006).

These overdensities, previously unexplored in the literature, offer a unique opportunity to advance our understanding of the formation and early evolution of protoclusters at $z\sim 9$ . All of these findings still require NIRCam spectroscopic follow-up to confirm (or reject) the physical association of the candidate members. In the near future, COSMOS-3D (GO 5893; PI Kakiichi) will obtain NIRCam WFSS grism observations with F444W in the COSMOS field. These grism data could spectroscopically confirm the protoclusters and high-z candidates; in particular, F444W can recover the important H $\beta$ and [OIII] $\lambda$ 5007 lines for galaxies at $z\sim7-9$ . If any of the candidates are confirmed spectroscopically, studying their evolution in different environments would allow us to examine their role in shaping the large-scale structure of the Universe during the epoch of reionisation.

Acknowledgement

The authors express their gratitude to the anonymous referee for the very constructive and insightful comments that have significantly improved the quality of this manuscript. The authors also express their gratitude to the staff in the JWST Helpdesk for their support throughout the process. TG acknowledges the support of the National Science and Technology Council of Taiwan through grants 113-2112-M-007-006-, 113-2927-I-007-501-, and 113-2123-M-001-008-. TH acknowledges the support of the National Science and Technology Council of Taiwan through grants 110-2112-M-005-013-MY3, 110-2112-M-007-034-, and 112-2123-M-001-004-. SH acknowledges the support of the Australian Research Council (ARC) Centre of Excellence (CoE) for Gravitational Wave Discovery (OzGrav) project numbers CE170100004 and CE230100016, and the ARC CoE for All Sky Astrophysics in 3 Dimensions (ASTRO 3D) project number CE170100013.

This work is based on observations made with the NASA/ESA/CSA James Webb Space Telescope. The data were obtained from the Mikulski Archive for Space Telescopes at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-03127 for JWST.

This work used high-performance computing facilities operated by the Centre for Informatics and Computation in Astronomy (CICA) at National Tsing Hua University. This equipment was funded by the Ministry of Education of Taiwan, the National Science and Technology Council of Taiwan, and the National Tsing Hua University.

Data availability statement

The COSMOS-Web DR0.5 is publicly available at https://cosmos.astro.caltech.edu/page/cosmosweb-dr. The JADES data used in this study are publicly available through the Mikulski Archive for Space Telescopes (MAST) as part of the High-Level Science Products (HLSP) release: https://archive.stsci.edu/hlsp/jades (DOI: 10.17909/8tdj-8n28e). The JADES DR3 catalogue is available at https://jades-survey.github.io/scientists/data.html. All processed data products and analysis scripts used in this work will be made available upon reasonable request to the corresponding author.

Footnotes

^a https://cosmos.astro.caltech.edu/page/cosmosweb-dr.

^b https://jwst-docs.stsci.edu/jwst-near-infrared-camera/nircam-performance/nircam-absolute-flux-calibration-and-zeropoints##gsc.tab=0.

^c https://jwst-crds.stsci.edu/browse/jwst_nircam_apcorr_0004.fits.

References

Adam, R., et al. 2016, A&A, 596, A108. issn: 1432-0746. https://doi.org/10.1051/0004-6361/201628897.

Arrabal Haro, P., et al. 2023, Nat, 622, 707. https://doi.org/10.1038/s41586-023-06521-7.

Behroozi, P., Wechsler, R. H., Hearin, A. P., & Conroy, C. 2019, MNRAS, 488, 3143. issn: 0035-8711. https://doi.org/10.1093/mnras/stz1182.

Bertin, E. 2010, Astrophysics Source Code Library, record ascl:1010.068, October.

Bertin, E., & Arnouts, S. 1996, A&AS, 117, 393. https://doi.org/10.1051/aas:1996164.

Boquien, M., Burgarella, D., Roehlly, Y., Buat, V., Ciesla, L., Corre, D., Inoue, A. K., & Salas, H. 2019, A&A, 622, A103. https://doi.org/10.1051/0004-6361/201834156. arXiv:1811.03094 [astro-ph.GA].

Brammer, G. B., van Dokkum, P. G., & Coppi, P. 2008, ApJ, 686, 1503. https://doi.org/10.1086/591786. arXiv:0807.1533 [astro-ph].

Brinch, M., et al. 2024, MNRAS, 527, 6591. https://doi.org/10.1093/mnras/stad3409. arXiv:2311.00511 [astro-ph.GA].

Bruzual, G., & Charlot, S. 2003, MNRAS, 344, 1000. https://doi.org/10.1046/j.1365-8711.2003.06897.x. arXiv:astro-ph/0309134 [astro-ph].

Bushouse, H., et al. 2022. https://doi.org/10.5281/zenodo.7429939.

Calvi, R., Dannerbauer, H., Haro, P. A., Rodríguez Espinosa, J. M., Muñoz-Tuñón, C., Pérez González, P. G., & Geier, S. 2021, MNRAS, 502, 4558. https://doi.org/10.1093/mnras/staa4037. arXiv:2101.02747 [astro-ph.GA].

Casey, C. M., et al. 2023, ApJ, 954, 31. https://doi.org/10.3847/1538-4357/acc2bc. arXiv:2211.07865 [astro-ph.GA].

Chanchaiworawit, K., et al. 2019, ApJ, 877, 51. https://doi.org/10.3847/1538-4357/ab1a34.

Chen, Z., Stark, D. P., Mason, C., Topping, M. W., Whitler, L., Tang, M., Endsley, R., & Charlot, S. 2024, MNRAS, 528, 7052. issn: 0035-8711. https://doi.org/10.1093/mnras/stae455.

Chiang, Y.-K., Overzier, R., & Gebhardt, K. 2013, ApJ, 779, 127. https://doi.org/10.1088/0004-637X/779/2/127. arXiv:1310.2938 [astro-ph.CO].

Chiang, Y.-K., Overzier, R. A., Gebhardt, K., & Henriques, B. 2017, ApJL, 844, L23. https://doi.org/10.3847/2041-8213/aa7e7b.

Conroy, C., & Gunn, J. E. 2010, ApJ, 712, 833. https://doi.org/10.1088/0004-637X/712/2/833. arXiv:0911.3151 [astro-ph.CO].

D’Eugenio, F., et al. 2025, ApJS, 277, 4. https://doi.org/10.3847/1538-4365/ada148. arXiv:2404.06531 [astro-ph.GA].

Dekel, A., & Birnboim, Y. 2006, MNRAS, 368, 2. https://doi.org/10.1111/j.1365-2966.2006.10145.x. arXiv:astro-ph/0412300 [astro-ph].

Eisenstein, D. J., et al. 2023. https://doi.org/10.48550/arXiv.2306.02465.

Finkelstein, S. L., et al. 2023, ApJ, 946, L13. https://doi.org/10.3847/2041-8213/acade4. arXiv:2211.05792 [astro-ph.GA].

Fudamoto, Y., et al. 2025. https://doi.org/10.48550/arXiv.2503.15597.

Gardner, J. P., et al. 2023, PASP, 135, 068001. https://doi.org/10.1088/1538-3873/acd1b5. arXiv:2304.04869 [astro-ph.IM].

Hainline, K. N., et al. 2024, ApJ, 964, 71. https://doi.org/10.3847/1538-4357/ad1ee4. arXiv:2306.02468 [astro-ph.GA].

Harikane, Y., et al. 2019, ApJ, 883, 142. https://doi.org/10.3847/1538-4357/ab2cd5. arXiv:1902.09555 [astro-ph.GA].

Harikane, Y., et al. 2022, ApJ, 929, 1. https://doi.org/10.3847/1538-4357/ac53a9. arXiv:2112.09141 [astro-ph.GA].

Harikane, Y., et al. 2023, ApJS, 265, 5. https://doi.org/10.3847/1538-4365/acaaa9.

Helton, J. M., et al. 2024a, ApJ, 962, 124. https://doi.org/10.3847/1538-4357/ad0da7. arXiv:2302.10217 [astro-ph.GA].

Helton, J. M., et al. 2024b, ApJ, 974, 41. https://doi.org/10.3847/1538-4357/ad6867.

Higuchi, R., et al. 2019, ApJ, 879, 28. https://doi.org/10.3847/1538-4357/ab2192. arXiv:1801.00531 [astro-ph.GA].

Hu, W., et al. 2021, NatAs, 5, 485. https://doi.org/10.1038/s41550-020-01291-y. arXiv:2101.10204 [astro-ph.GA].

Ishigaki, M., Ouchi, M., & Harikane, Y. 2016, ApJ, 822, 5. https://doi.org/10.3847/0004-637X/822/1/5. arXiv:1509.01751 [astro-ph.GA].

Koekemoer, A. M., et al. 2011, ApJS, 197, 36. https://doi.org/10.1088/0067-0049/197/2/36. arXiv:1105.3754 [astro-ph.CO].

Laporte, N., Zitrin, A., Dole, H., Roberts-Borsani, G., Furtak, L. J., & Witten, C. 2022, A&A, 667, L3. https://doi.org/10.1051/0004-6361/202244719. arXiv:2208.04930 [astro-ph.GA].

Larson, R. L., et al. 2022, ApJ, 930, 104. https://doi.org/10.3847/1538-4357/ac5dbd. arXiv:2203.08461 [astro-ph.GA].

Li, Q., et al. 2025, MNRAS, 539, 1796. https://doi.org/10.1093/mnras/staf543. arXiv:2405.17359 [astro-ph.GA].

Lovell, C. C., Thomas, P. A., & Wilkins, S. M. 2018, MNRAS, 474, 4612. https://doi.org/10.1093/mnras/stx3090. arXiv:1710.02148 [astro-ph.GA].

Muldrew, S. I., Hatch, N. A., & Cooke, E. A. 2015, MNRAS, 452, 2528. issn: 0035-8711. https://doi.org/10.1093/mnras/stv1449.

Polletta, M., et al. 2021, A&A, 654, A121. https://doi.org/10.1051/0004-6361/202140612. arXiv:2109.04396 [astro-ph.GA].

Trenti, M., et al. 2012, ApJ, 746, 55. https://doi.org/10.1088/0004-637X/746/1/55. arXiv:1110.0468 [astro-ph.CO].

Wang, F., et al. 2024, ApJ, 962, L11. https://doi.org/10.3847/2041-8213/ad20ef. arXiv:2402.01844 [astro-ph.GA].

Weaver, J. R., et al. 2022, ApJS, 258, 11. https://doi.org/10.3847/1538-4365/ac3078. arXiv:2110.13923 [astro-ph.GA].

White, S. D. M., & Rees, M. J. 1978, MNRAS, 183, 341. https://doi.org/10.1093/mnras/183.3.341.

Yajima, H., et al. 2022, MNRAS, 509, 4037. https://doi.org/10.1093/mnras/stab3092. arXiv:2011.11663 [astro-ph.GA].

Figure 1. The two-colour diagram for sources that have $S/N \gt 2$ in the F115W+F277W+F444W detection image of COSMOS-Web DR0.5 (Casey et al. 2023). The green area shows colours that satisfy the F115W-dropout criteria from Harikane et al. (2023). The colour is measured with a 0.3" diameter circular aperture. The numbers indicate how many sources there are in each subset.

Figure 2. Spectral energy distribution (SED) of the spectroscopically confirmed galaxy GN-z11 at redshift $z=10.60$ (black line), overlaid with the 5$\sigma$ detection limits of various filters used in this study. The coloured upward arrows indicate 5$\sigma$ limiting depths in each band: F814W (purple), F115W (light blue), F150W (green), F277W (orange), and F444W (red). For each band, the fainter (transparent) arrows denote the shallower 5$\sigma$ depths reached in approximately 50% of the survey area, due to non-uniform coverage and exposure time. Fluxes are shown in nJy on the left y-axis, with the corresponding AB magnitudes on the right y-axis. Horizontal error bars represent the approximate width of each filter’s transmission curve. This figure illustrates the ability of the JWST NIRCam bands to probe the rest-frame UV-to-optical emission of galaxies at $z\gt10$.

Figure 3. The two-colour diagram for sources from JADES DR3 matching the COSMOS-Web limiting magnitudes. Scattered grey dots represent all objects that have both NIRCam and NIRSpec data. The green polygon delineates our F115W-dropout selection criteria. Coloured stars are spectroscopically confirmed galaxies with $8.0\leq z_{\text{spec}}\leq12.0$; the over-plotted numbers mark their spectroscopic redshifts. Red/blue colours mark galaxies that satisfy/fail the colour criteria. Black triangles indicate sources with $z_{\text{spec}}\lt8.0$ that satisfy the colour criteria. The text annotations quote a contamination rate of 25% (ratio of galaxies with $z_{\text{spec}}\lt8.0$ that satisfy the colour criteria ($N=5$) to the number of sources that satisfy the colour criteria ($N=20$)) and a loss rate of 42% (ratio of $8.0\leq z_{\text{spec}}\leq12.0$ galaxies that do not satisfy the colour criteria ($N=11$) to the number of all $8.0\leq z_{\text{spec}}\leq12.0$ galaxies ($N=26$)).

Figure 5. Comparison between photometric redshifts from CIGALE/EAZY and spectroscopic redshifts from JADES DR3. Grey points represent all galaxies with both photometry and spectroscopic redshifts in the JADES field. Note that we did not use filters that are not available in COSMOS-Web, and the JADES photometry is downgraded to the COSMOS-Web quality. Green stars denote JADES galaxies that satisfy our F150W-dropout colour criteria. The solid black line indicates the one-to-one correspondence ($z_{\mathrm{spec}} = z_{\mathrm{phot}}$), and the red points highlight sources within the dashed lines of 10% deviation. For the high-redshift subset ($z_{\mathrm{spec}} \gt 8$), we find a significantly improved normalised median absolute deviation (NMAD) and outlier fraction. This figure demonstrates the performance of photometric redshift estimation under COSMOS-Web-like conditions.

Figure 6. CIGALE best-fit spectral energy distribution and redshift likelihood for galaxy. Left: The black curve shows the total model SED corresponding to the minimum-$\chi^2$ $z \gt 8$ solution. Coloured components indicate the attenuated stellar continuum (yellow), grey intrinsic stellar emission (blue dashed), nebular lines and continuum (green), and thermal dust emission (red). Magenta circles mark the observed NIRCam/IRAC fluxes, while the green triangle denotes a 2-sigma upper limit. We added the best $z_{\mathrm{phot}} \lt 8$ solution as a grey line for reference. Top-right: variation of $\chi^2$ with redshift. The sharp minimum at $z\approx9.3$ and the absence of significant secondary solutions ($\Delta\chi^2 \leq 9$) at lower redshift $(z \lt 8)$ confirm the robustness of the high-z interpretation. Bottom-right: 2" $\times$ 2" cutouts in HST/ACS F814W and JWST/NIRCam F115W, F150W, F277W, and F444W (left to right). Red tick marks (0.5" in length) identify the target. The non-detections in F814W and F115W, coupled with clear detections long-ward of 1.5 $\mu$m, are consistent with a Lyman-break galaxy at $z\approx9.3$.

Figure 7. Histograms of galaxy overdensity, $\delta$, measured for all galaxies in the field at different aperture radii, $R = 0.5$, 1.0, 2.5, 5.0, and 7.5 cMpc (from top to bottom, left to right), for two redshift intervals: (Left) $9 \lt z \lt 10$, (Right) $10 \lt z \lt 11$. Each panel shows the distribution of $\delta$ values, with the blue bars representing the number of galaxies. The shaded region indicates the 1$\sigma$ range around the mean. The red dashed line marks the median, while the green dashed line shows the mean overdensity. The legend in each panel indicates the aperture size and the corresponding statistical values. Note that the overdensity distribution becomes narrower and shifts toward lower values as the aperture radius increases, reflecting the dilution of local enhancements over larger spatial scales.

Figure 8. This shows the projected distribution of F115W dropout galaxies across the COSMOS-Web footprint. Black circles mark all sources whose best-fit photometric redshifts satisfy the range $9\le z_{\mathrm{best}}\le10$ (or $10\le z_{\mathrm{best}}\le11$ in the bottom panel). The size of each scatter point represents the significance ($\sigma$) of its overdensity value compared to the Monte Carlo sampling. A Gaussian-kernel surface-density map (shown in green-to-blue shading) highlights significant overdensities. Within each overdense peak, galaxies with overdensities greater than the 95th percentile of the entire distribution ($\delta\ge\delta_{95}$) and robust enough ($\sigma\gt3$) over the Monte Carlo sampling are designated a protocluster core candidate (red star), while blue stars indicate galaxies lying inside an $R=7.5$ cMpc aperture centred on the cores. The red polygon outlines the NIRCam field of view, and the grey background denotes regions outside the imaging coverage.

Figure 9. Left: JWST/NIRCam three-colour mosaic (B: F814W+F115W, G: F150W+F277W, R: F444W) centred on the COSMOS-Web protocluster candidate COSMOS-Web_PC-1 ($z_{\mathrm{phot}}\simeq 9.3$). Red stars mark the positions of member/core galaxies. Solid frames identify candidate members with a higher probability ($W\gt0.5$) of being associated with the cores, while dashed frames mark lower-probability sources ($0.025\lt W \leq0.5$). Each cutout has a field of view of 4"$\times$4". The white bar in the lower-left corner corresponds to 54" (2.5 cMpc) at $z\simeq 9.5$. North is up, and east is to the left. Right: The PDFs of the cores and potential members of the protocluster candidate COSMOS-Web_PC-1. Solid and dashed lines correspond to the sources with solid and dashed frames in the left panel.
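The scale-bar conversion between comoving length and angle can be checked with a short numerical integration. The sketch below assumes a flat $\Lambda$CDM cosmology with $H_0 = 67.7$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_m = 0.31$; these values are an assumption made here for illustration, and the helper names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def comoving_distance_mpc(z, h0=67.7, omega_m=0.31, steps=10000):
    """Line-of-sight comoving distance in Mpc for flat LCDM (midpoint rule).

    h0 and omega_m are assumed values, not parameters quoted in this section.
    """
    omega_l = 1.0 - omega_m
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zm = (i + 0.5) * dz
        total += dz / math.sqrt(omega_m * (1.0 + zm) ** 3 + omega_l)
    return C_KM_S / h0 * total

def comoving_to_arcsec(length_cmpc, z, **cosmo):
    """Angle in arcsec subtended by a transverse comoving length at redshift z.

    In a flat universe the transverse comoving distance equals the
    line-of-sight comoving distance, so theta = L / D_C.
    """
    theta_rad = length_cmpc / comoving_distance_mpc(z, **cosmo)
    return math.degrees(theta_rad) * 3600.0

bar_arcsec = comoving_to_arcsec(2.5, 9.5)
print(f"2.5 cMpc at z = 9.5 subtends about {bar_arcsec:.1f} arcsec")
```

Under these assumed parameters the result lands within a few arcseconds of the 54" quoted for the scale bar; the exact value shifts slightly with the adopted $H_0$ and $\Omega_m$.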

Figure 10. Same as Figure 9, but for COSMOSWeb_PC-2.

Figure 11. Same as Figure 9, but for COSMOSWeb_PC-3.

Figure 12. Same as Figure 9, but for COSMOSWeb_PC-4.

Figure 13. Same as Figure 9, but for COSMOSWeb_PC-5.

Figure 14. Same as Figure 9, but for COSMOSWeb_PC-6.

Figure 15. Same as Figure 9, but for COSMOSWeb_PC-7.

Table 1. The basic information of the member galaxies in each protocluster candidate.

Figure 16. The halo mass of protoclusters as a function of redshift. The area between the grey lines shows the predicted evolutionary track of a massive cluster from the semi-analytic model of Chiang, Overzier, & Gebhardt (2013), whereas the dashed line is its linear extrapolation. The data points are the observed protoclusters from Trenti et al. (2012), Chanchaiworawit et al. (2019), Calvi et al. (2021), Harikane et al. (2019), Laporte et al. (2022), and Helton et al. (2024b). The red stars indicate the seven overdense regions identified in this work. The dashed line illustrates the typical threshold mass for a stable shock in a spherical infall. Below this threshold, the flows are predominantly cold, while above it, a shock-heated medium is present (Dekel & Birnboim 2006).

