Hostname: page-component-745bb68f8f-b95js Total loading time: 0 Render date: 2025-02-04T11:16:58.105Z Has data issue: false hasContentIssue false

Strategy for mitigation of systematics for EoR experiments with the Murchison Widefield Array

Published online by Cambridge University Press:  04 December 2024

Chuneeta D. Nunhokee*
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), Curtin University, Bentley, WA, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), Bentley, Australia Curtin Institute of Radio Astronomy, Perth, WA, Australia
Dev Null
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), Curtin University, Bentley, WA, Australia Curtin Institute of Radio Astronomy, Perth, WA, Australia
Cathryn Trott
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), Curtin University, Bentley, WA, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), Bentley, Australia Curtin Institute of Radio Astronomy, Perth, WA, Australia
Christopher Jordan
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), Curtin University, Bentley, WA, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), Bentley, Australia Curtin Institute of Radio Astronomy, Perth, WA, Australia
Jack Laurence Bramble Line
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), Curtin University, Bentley, WA, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), Bentley, Australia Curtin Institute of Radio Astronomy, Perth, WA, Australia
Randall Bruce Wayth
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), Curtin University, Bentley, WA, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), Bentley, Australia Curtin Institute of Radio Astronomy, Perth, WA, Australia
Nichole Barry
Affiliation:
International Centre for Radio Astronomy Research (ICRAR), Curtin University, Bentley, WA, Australia ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), Bentley, Australia Curtin Institute of Radio Astronomy, Perth, WA, Australia
*
Corresponding author: Chuneeta D. Nunhokee, Email: ridhima.nunhokee@curtin.edu.au
Rights & Permissions [Opens in a new window]

Abstract

Observations of the 21 cm signal face significant challenges due to bright astrophysical foregrounds that are several orders of magnitude higher than the brightness of the hydrogen line, along with various systematics. Successful 21 cm experiments require accurate calibration and foreground mitigation. Errors introduced during the calibration process such as systematics can disrupt the intrinsic frequency smoothness of the foregrounds, leading to power leakage into the Epoch of Reionisation window. Therefore, it is essential to develop strategies to effectively address these challenges. In this work, we adopt a stringent approach to identify and address suspected systematics, including malfunctioning antennas, frequency channels corrupted by radio frequency interference, and other dominant effects. We implement a statistical framework that utilises various data products from the data processing pipeline to derive specific criteria and filters. These criteria and filters are applied at intermediate stages to mitigate systematic propagation from the early stages of data processing. Our analysis focuses on observations from the Murchison Widefield Array Phase I configuration. Out of the observations processed by the pipeline, our approach selects 18%, totalling 58 h, that exhibit fewer systematic effects. The successful selection of observations with reduced systematic dominance enhances our confidence in achieving 21 cm measurements.

Type
Research Article
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of Astronomical Society of Australia

1. Introduction

The Epoch of Reionisation (EoR) marked a significant transition in the history of the Universe, about 400 million years after the Big Bang, when the first galaxies along with other cosmic structures formed, and the intergalactic medium transitioned from a neutral state to an ionised state. This epoch unveils a wealth of information including formation and ionisation of the first cosmic structures aiding us to gain insights into the early universe’s physical processes. One way to study this era is through mapping of neutral hydrogen and subsequently tracing its evolution. Neutral hydrogen emits and absorbs radiation at the specific wavelength of 21 cm. The 21 cm hydrogen line (HI) corresponds to the transition between two energy states of the hydrogen atom, specifically the spin-flip transition of the electron in ground state.

The 21 cm HI can be mapped through its spatial fluctuations, measured from the difference in brightness temperatures across the intergalactic medium, encoding information about the density and ionisation state of neutral hydrogen, as well as the clustering and growth of cosmic structures. The underlying physical processes driving the EoR can be inferred through statistical properties of these fluctuations using power spectrum analysis (Furlanetto Reference Furlanetto and Mesinger2016). Experiments such as the Giant Metrewave Radio Telescope Epoch of Reionisation (GMRT, Paciga et al. Reference Paciga2011), the Hydrogen Epoch of Reionisation (HERA, DeBoer et al. Reference DeBoer2017; Berkhout et al. Reference Berkhout2024), the Murchison WideField Array (MWA, Tingay et al. Reference Tingay2013; Wayth et al. Reference Wayth2018) and the LOw Frequency ARray (LOFAR, van Haarlem et al. Reference van Haarlem2013) are currently focused at the statistical detection of the 21 cm HI. The aforementioned instruments are a combination of both first and second-generation telescopes focused at studying large-scale structures before and during reionisation between redshifts of 6–11. HERA has recently reported the most stringent upper limits of $\Delta^2 \leq (21.4 \, \mathrm{mK})^2$ at $k=0.34 \, h$ $\mathrm{Mpc}^{-1}$ and $\Delta^2 \leq (59.1 \, \mathrm{mK})^2$ at $k=0.36 \,h\,\mathrm{Mpc}^{-1}$ at redshifts of $7.9$ and $10.4$ , respectively, placing new constraints on astrophysical parameters of reionisation. These results suggest heating of the intergalactic medium above the adiabatic cooling limit must have occurred by at least $z=10.4$ (Abdurashidova et al. Reference Abdurashidova2022; HERA Collaboration et al. Reference Collaboration2023).

Figure 1. Data processing pipeline from extracting data to constructing a power spectra. The intermediate processes where inspection and filtering of the data products are carried out, are indicated in red. Each of the stages portrayed in black boxes are explained throughout the paper.

While 21 cm experiments hold great potential to enhance our understanding on the evolution of the early Universe, they are challenged by strong astrophysical foregrounds, both Galactic and extraGalactic, several orders of magnitude higher than the 21 cm HI. To date, two fundamental approaches have been applied solely or in a hybrid system to mitigate foreground contamination: (1) the subtraction method whereby foregrounds are modelled and subtracted from the data (Mitchell et al. Reference Mitchell, Greenhill, Wayth, Sault, Lonsdale, Cappallo, Morales and Ord2008; Morales, Bowman, & Hewitt Reference Morales, Bowman and Hewitt2006; Bernardi et al. Reference Bernardi, Mitchell, Ord, Greenhill, Pindor, Wayth and Wyithe2011, Reference Bernardi2013; Morales et al. Reference Morales, Beardsley, Pober, Barry, Hazelton, Jacobs and Sullivan2019); and (2) the avoidance scheme where foregrounds are constrained to lower spatial scales and an ‘EoR window’ is defined (Morales & Hewitt Reference Morales and Hewitt2004; Vedantham, Udaya Shankar, & Subrahmanyan Reference Vedantham, Udaya Shankar and Subrahmanyan2012; Liu, Parsons, & Trott Reference Liu, Parsons and Trott2014a; Liu, Parsons, & Trott Reference Liu, Parsons and Trott2014b). However, both techniques are prone to calibration errors (Barry et al. Reference Barry, Hazelton, Sullivan, Morales and Pober2016; Patil et al. Reference Patil2016; Ewall-Wice et al. Reference Ewall-Wice, Dillon, Liu and Hewitt2017), uncertainties in foreground and primary beam (i.e., Neben et al. Reference Neben2016; Procopio et al. Reference Procopio2017; Ewall-Wice et al. Reference Ewall-Wice, Dillon, Liu and Hewitt2017; Nunhokee et al. Reference Nunhokee2020), and systematics such as radio frequency interferences (RFI), instrumental polarisation leakage and mutual coupling between antennas, etc (Moore et al. Reference Moore2017; Nunhokee et al. Reference Nunhokee2017; Wilensky et al. Reference Wilensky, Morales, Hazelton, Barry, Byrne and Roy2019; Barry et al. Reference Barry2019; Kern et al. Reference Kern2020; Josaitis et al. Reference Josaitis, Ewall-Wice, Fagnoni and de Lera Acedo2022; Charles et al. Reference Charles2022; Kolopanis et al. Reference Kolopanis, Pober, Jacobs and McGraw2023; Wilensky et al. Reference Wilensky2023; Murphy et al. Reference Murphy2023). These errors, if not treated can potentially lead to biases and leakages in our EoR measurements. Our work presents a strategy to mitigate these systematics by quantifying them for each observation through a set of metrics to prevent them from propagating to the power spectrum.

The paper is broken down such that Section 2 introduces the methodology followed by details of observation setup in Section 3. The data processing pipeline is discussed in Section 4 where detailed explanation of each stage is provided. Section 5 describes the power spectrum analysis, and section 6 talks on the data quality assessment. Results are interpreted in Section 7 and conclusions are drawn in Section 8.

2. Methodology

Astronomers have been grappling with systematic propagation to avoid biases in the 21 cm power spectra measurements for years. Two fundamental methods have been employed to date:

  1. 1. Identify potential systematics and discard or flag them.

  2. 2. Identify potential systematics and apply mitigation techniques to address them.

Systematics can arise from the presence of corrupted timestamps, corrupted frequency channels, RFI sources, instrumental leakages, as well as from unknows origins. Efforts have been dedicated towards detecting the known systematic sources and alleviating them. However, we are not confident about the goodness of data points surrounding the corrupted ones. We could potentially extend the flags to neighbours, but this avenue may turn into an indefinite process (Offringa Reference Offringa2010). A quantitative approach to how much the identified flags could leak into the non-identified is required and this demands precise understanding of the systematic source. Further, mitigation techniques struggle to remove systematics of unknown origins which ultimately leave some traces behind. These residual systematics, even with low intensities could potentially harm 21 cm measurements (Wilensky et al. Reference Wilensky, Morales, Hazelton, Barry, Byrne and Roy2019; Kolopanis et al. Reference Kolopanis, Pober, Jacobs and McGraw2023). The data might also be prone to uncertainties associated with the mitigation techniques, adding to the existing systematics.

This work embraces the first method where we reject any outlying observations. We developed a statistical framework that interrupts the data processing pipeline such that the output data products are thoroughly inspected before they proceed to the next step. It is designed such that any dysfunctional antennas, bad timestamps or frequency channels are discarded before the filtering process. After successful passing through the preliminary gateways, a set of filters are formulated from the derived metrics to identify outlying observations. However, the derived metrics may not be sufficiently robust to capture faint systematics (e.g. ultra-faint RFI, Wilensky et al. Reference Wilensky2023).

The data processing pipeline is shown in Fig. 1 whereby statistical metrics are derived and administered at the intermediate steps, with some of the main ones highlighted in red. The caveat of this strict approach is reduction in the number of observations that survives. Nevertheless we believe it is better to prevent systematics from escaping into the final measurements that would induce biases. We used observations from the MWA to implement the statistical framework. Details of the observational setup are presented in the next section.

3. Observations

The MWA is a radio telescope, located at Inyarrimanha Ilgari Bundara, the Commonwealth Scientific and Industrial Research Organisation (CSIRO) Murchison Radio-astronomy Observatory, in the mid-west of Western Australia, about 300 km inland from the coastal town of Geraldton. The location is considered pristine to study the evolution of our Universe for its low-level radio frequency interference (Offringa et al. Reference Offringa2015). The instrument serves as a precursor for the Low-Frequency Square Kilometre Array telescope, currently under construction on the same site.

The development of MWA is split into several phases. The instrument kicked off with 32 tiles in 2009 (Phase 0, Ord et al. Reference Ord2010), extending to 128 tiles in December 2012 (Phase I; Fig. 2, Tingay et al. Reference Tingay2013). It was further upgraded to Phase II through the addition of 72 new tiles arranged in two compact hexagons along with 56 existing tiles pseudo-randomly spread (for detailed information refer to Wayth et al. Reference Wayth2018). However, this work is restricted to observations from Phase I configuration. Each tile is made up of 4 $\times$ 4 dual polarised dipoles, optimised to operate between 80–300 MHz.

Figure 2. The telescope configuration during Phase I with 128 tiles pseudorandomly placed within a circumference of about 1.5 $\sim$ km.

The telescope was steered at seven pointings listed in Table 1. In this paper, we will be targeting Phase I high band frequency observations between 167–197 MHz from EoR0 field centred at (RA=0 h, DEC $=-27^{\circ}$ ). The field consists of foregrounds contributed by the setting of the Galactic plane (Barry et al. Reference Barry, Line, Lynch, Kriele and Cook2023) on the western horizon. The EoR experiment has observations spanning from 2013 to 2015 for the Phase I configuration. Each observation lasted for 2 min, totalling to about 322 h (9 655 in number). A breakdown of the data with respect to pointings are illustrated in Fig. 3.

Table 1. The altitudes and azimuths of the seven telescope beam pointings used in this analysis. Pointing 0 is the zenith, -3 is three pointings before zenith, and 3 is three pointings after zenith.

Figure 3. The bars in blue show the hours of observations extracted from MWA archival database for the individual pointing. The bars in green show the data that were selected based on the constraints described in Section 6.1. Number of nights associated with the pointings are delineated on the bars.

4. EoR data pipeline

The data described in Section 3 were first downloaded from the MWA All-Sky Virtual Observatory (ASVO) database as raw files produced by the correlator. The downloaded data were then passed through the EoR pipeline. The boxes highlight the processes that include flagging, calibration, foreground subtraction, delay transform analysis, imaging, and power spectrum analysis. The data quality assessment procedures are denoted by the red rhombuses.

4.1. Flagging and pre-processing

We applied AOFlagger (Offringa Reference Offringa2010; Offringa, van de Gronde, & Roerdink Reference Offringa, van de Gronde and Roerdink2012) with the default MWA RFI strategy settings to the data. In addition to RFI identification and mitigation, AOFlagger flags known corrupted frequency channels namely the edges and centre of the coarse bands. Pre-processing was then conducted by Birli where the data was transformed from the correlator output format to a UVFITS format. The frequency channels were averaged to 40 kHz and the time intervals to 2 s. The receiving signal chain undergoes a state change at the start of each observation, occurring 2 s after the initiation of the GPS time. Additionally, the ending timestamps may potentially be affected by pointing, frequency or attenuation changes, rounded up to the next correlator dump time. Therefore, all timestamps 2 s after the start and 1 s before the end of each observation were flagged. The time flags vary across observations for several reasons: (1) only common timestamps of the coarse frequency channels were used; (2) some observations had late starting or early ending times; (3) averaging setting across time were different due to a mix of time resolutions in our observations, resulting in different weights being assigned. While averaging in frequency, the centre channel and 80 kHz edge channels were flagged per $1.28$ MHz coarse band.

We then focused on the autocorrelated visibilities, where the signal from one antenna is correlated with itself. Since we expect the bandpass gains to behave similarly across antennas, potential outliers can be identified from the autocorrelations before calibration. As the gains are stable across each observation (2 min interval), we averaged the autocorrelations in time. It is important to note that the autocorrelated visibilities were normalised by a reference antenna, which was taken to be last antenna in a non-flagged instrument configuration. Modified z-scores were evaluated on the amplitudes of the averaged autocorrelations. Antennas with modified z-scores greater that $3.5$ were identified as outliers or dysfunctional antennas. The z-score analysis was iterated until no outliers were found. Subsequently, the dysfunctional antennas were flagged during calibration.

4.2. Calibration

In a two-element radio interferometer, the correlation of the signal received at each element, termed as visibility, is measured. Assuming a flat sky, the relationship between the sky brightness distribution $\mathbf{S}$ and the visibility $\mathbf{V}$ across a baseline is given by

(1) \begin{equation} \mathbf{V}(\mathbf{b}, \nu) = \int_{\Omega} \mathbf{A} ({\hat{\mathbf{r}}}, \nu) \mathbf{S}({\hat{\mathbf{r}}}, \nu) e^{-2\pi i \nu \frac{\mathbf{b}. {\hat{\mathbf{s}}}}{c}} ~d\Omega.\end{equation}

Here, $\mathbf{b} = (u,v, w)$ represents the baseline projection, ${\hat{\mathbf{r}}}$ is the unit vector representing the direction cosines on the celestial sphere and $\nu$ is the observing frequency. The primary beam response of the antenna is denoted by the $2\times 2$ matrix:

(2) \begin{equation}\mathbf{A}=\begin{pmatrix} A_{EW} & \quad D_{EW}\\[3pt] D_{NS} & \quad A_{NS}\end{pmatrix}\end{equation}

where $A_x$ and $A_y$ represent the antenna response along the East-West (EW) and North-South (NS) directions, respectively, and $D_{EW}$ and $D_{NS}$ are the terms that describe any instrumental leakage resulting from the signal from one polarisation escaping into the other (see Hamaker, Bregman, & Sault Reference Hamaker, Bregman and Sault1996; Offringa, van de Gronde, & Roerdink 2017). The signal received at the antenna gets corrupted along its propagation path by both direction independent and direction dependent antenna gains, thereby corrupting the visibilities. In this work, we solved for only the direction independent gains with Hyperdrive (Jordan Reference Jordan2023), leaving the direction dependent gains for future. We used the MWA Long Baseline Epoch of Reionisation Survey (LoBES) catalogue as the foreground model, derived from the EoR0 field targeting EoR experiments (Lynch et al. Reference Lynch2021). Given that the model contains information only for the Stokes I parameter, the remaining three Stokes components in equation (1) were assumed to be zero. We utilised the Full Embedded Element (FEE) primary beam model generated with Hyperbeam (Sokolowski et al. Reference Sokolowski2017).

The calibration process involves applying least square minimisation, iteratively solving for per-antenna complex gains for each frequency and time, enabling the capture of spectral structures. With the number of iterations set to 50 and the convergence threshold at $10^{-6}$ , the solutions encapsulated most spectral features. Ideally, a convergence closer to zero is desirable, but this would require increasing the number of iterations, thus increasing computational requirements. As this work focuses on the diagnosis side of the data processing pipeline, the convergence aspect falls beyond the scope of this paper. For each antenna, frequency channels that did not reach the specified threshold were flagged within the calibration process, possibly due to RFI or systematics.

We then investigated the calibration solutions for anomalies. The gain solutions were normalised with respect to the last unflagged antenna. We began by evaluating the RMS of the gain amplitudes and antennas and a three sigma thresholding was applied for identifying misbehaving antennas. However, as depicted in Fig. 4, we were not able to spot poor behaviour such as fast fringing of the phases at low or high frequencies from the amplitudes. These behaviours were hence, manually identified and removed.

Figure 4. Top: Amplitudes of the gain solutions obtained for an observation with the colours representing individual antennas. No misbehaving antenna are identified. Bottom: Phases of the calibration solutions for the corresponding observation. The black points highlights the antenna with high fringing phases at low frequencies.

The solutions were then applied to form calibrated visibilities that were fed into foreground mitigation step.

Figure 5. Left: Deconvolved images generated through the process described in Section 4.4 for a zenith pointing observation with (a) representing Stokes I formed from unsubtracted visibilities, (b) and (c) representing Stokes I and V from subtracted and ionospheric phase corrected visibilities, respectively, and (d) difference between source subtracted visibilities before and after phase offset correction. The units of the colorbar is Jy $\mathrm{beam}^{-1}$ . The inset shows a zoom version of the region surrounding the brightest source in the field of view $PKS0026-23$ .

4.3. Foreground subtraction

In this work, we employed a foreground subtraction approach, wherein we modelled and subtracted foreground visibilities for 4 000 sources within the field of view. These model visibilities were constructed following the methodology outlined in Section 4.2, utilising the LoBES catalogue and FEE primary beam. This method effectively decreased foreground contamination at low k modes by approximately an order of magnitude in the EoR window. However, our analysis revealed shortcomings in the results obtained through standard subtraction techniques, manifesting as either under-subtraction or over-subtraction. This behavior can primarily be attributed to direction-dependent effects caused by the ionosphere, leading to positional phase shifts. Such phenomena have been previously investigated in MWA observations and addressed through the ‘peeling’ technique, described in Mitchell et al. (Reference Mitchell, Greenhill, Wayth, Sault, Lonsdale, Cappallo, Morales and Ord2008).

Following suit, we incorporated this peeling approach into our analysis, correcting phase offsets for the 1 000 brightest sources in the field of view. The selection of sources for peeling was determined based on the minimal requirement of approximately 50–60 sources for an image size of ${\sim}30^{\circ}$ , as evaluated in Mitchell et al. (Reference Mitchell, Greenhill, Wayth, Sault, Lonsdale, Cappallo, Morales and Ord2008). However, the computational resources imposed a cap on the number of sources that could be included. We evaluated the efficacy of this peeling technique through imaging improvements, which will be discussed in the following subsection.

4.4. Imaging

The next step in our data processing pipeline is generating images to reinforce the quality assurance through visual inspection and add to the statistical metrics. Images with angular resolutions of 40 arcsec were formed by Fourier transforming the visibilities along the East-West and North South directions (EW and NS polarisations) using WSClean (Offringa et al. Reference Offringa2014). All the sub bands were combined together using the multifrequency synthesis algorithm. Briggs weighting with robustness of $-1$ was used such that we emphasised more on the resolution and reduction of the sidelobes but at the same time increasing the signal to noise ratio for quality assurance (Briggs Reference Briggs1995). Cotton–Schwab algorithm was employed and the images were deconvolved down to a threshold of 1 Jy, chosen to reduce computational cost as deeper cleaning was not required for diagnostic purpose. Example of a pseudo-Stokes I image is shown in plot (a) of Fig. 5. Stokes V images were also created in the same fashion.

Panel (b) of Fig. 5 were formed from the subtracted visibilities that were corrected for ionospheric phase offset. The overall RMS in Stokes I drops from $1.66$ to $0.5$ Jy beam $^{-1}$ . We found that subtraction across EW polarisation performed better with a decrease in RMS by a factor of 60 while for NS, the factor is 23 because of the galactic plane aliasing being more prominent along this direction. Stokes V plotted in panel (c) showed marginal difference after subtraction. The improvement made to the subtraction after accounting for phase offsets is illustrated in the difference image in (d). It is observed that apart from a better subtraction of the brightest source in the field of view $PKS0026-23$ resulting into reduced sidelobe intensities, there are other visible sources that were under subtracted. We also found a flux difference of 900 mJy in $PKS0026-23$ . The overall RMS for this observation drops by an order of magnitude for EW polarisation while marginal difference is found in NS and no difference in Stokes V. The RMS across all the pixels for each observation is plotted in Fig. 6. The mean is seen to be shifted slightly to the left after correction and the standard deviation is more constrained for both polarisations. These pointers indicate a significant refinement in the foreground removal.

Figure 6. RMS distribution across all observations before and after phase offset correction for EW (left) and NS (right) polarisations.

After foreground subtraction, observations were fed into the power spectrum machinery.

5. Power spectrum analysis

The power spectrum defines the power of the signal as a function of k modes. In k space, (u, v) represents the Fourier modes of the measured visibility points, (l, m) are the angular modes $k_{\perp}$ and $k_{||}$ are modes paralell to the line-of -sight mapped from the spectral channels. The power spectrum can then be given by

(3) \begin{align} P(k) &= P\left(\sqrt{k_{\perp}^2 + k_{\parallel}^2}\right) \nonumber \\[4pt]& = \frac{1}{\Omega}\left\langle \tilde{\mathbf{V}}(k) \tilde{\mathbf{V}}^*(k)\right\rangle \end{align}

where $\Omega$ is the observing volume. The visibilities in Equation (1) are gridded onto a uv grid and Fourier transformed along the frequency axis. The one-dimensional power spectra can hence be defined as the integrated power over k space:

(4) \begin{equation} \Delta^2 = \frac{k^3}{2\pi^2} P(k). \end{equation}

In our work, we utilised the Cosmological HI Power Spectrum estimator (CHIPS) pipeline for generating power spectra from MWA observations (Trott et al. Reference Trott2016). While we did not perform a direct comparison or validation of our results with other approaches, we believe it is valuable to explain our choice.

Among the active pipelines for MWA data power spectrum generation, including Fast Holographic Deconvolution (FHD/eppsilon; Sullivan et al. Reference Sullivan2012; Barry et al. Reference Barry2019) and simpleDS (Kolopanis et al. Reference Kolopanis, Pober, Jacobs and McGraw2023), we opted for CHIPS due to several key considerations. Firstly, we ruled out the simpleDS pipeline because it is primarily designed for redundant configurations, targeting Phase II MWA data, which did not align with our observational setup. CHIPS constructs power spectra from a discrete uv-plane, whereas FHD/eppsilon adopts an image-based procedure. By utilising the uvw plane, CHIPS effectively circumvents aliasing issues that FHD/eppsilon encounters. Moreover, CHIPS applies an inverse variance weighting, to account for the frequency-dependent weights in an optimal way. Previous studies (Barry et al. Reference Barry2019) have demonstrated that the results obtained from both CHIPS and FHD/eppsilon pipelines exhibit consistency and follow a general trend. Line et al. (Reference Line, Trott, Cook, Greig, Barry and Jordan2024) also demonstrates that the Hyperdrive-CHIPS pipeline does not suffer from signal loss. This further supports our decision to utilise CHIPS for our analysis, ensuring robust and reliable power spectrum estimation from MWA observations.

6. Data quality assurance

A data quality assessment was conducted at the stages outlined in Fig. 1, enabling us to identify and filter visibilities exhibiting anomalous behaviour and patterns. These anomalies were identified from various data products, as discussed in the following subsections. The cutoff thresholds and number of successful observations that passed through the various stages are presented in Table 2.

Table 2. List of metrics used as diagnosis for filtering observations.

6.1. Data quality issues

The archival data from Section 3 were triaged before pre-processing as described in Section 4.1, to ensure that no data quality uncertainties were associated with them. Elements considered in this process included:

  • Errors in beamformer communication on individual tiles.

  • Discrepancies in attenuation settings of the receiver.

  • Recorded events from the Monitor and Control System.

  • Presence of two or more disabled dipoles along the same polarisation, indicating a flagged tile.

Additionally, observations with high levels of ionospheric activity, capable of distorting our measurements, were identified and discarded. This was achieved using the ionospheric metric developed by Jordan et al. (Reference Jordan2017), which incorporates the median source offset and source offset anisotropy derived from the measured versus expected source positions of 1 000 point sources in the field of view. Observations yielding an ionospheric metric greater than the cut-off threshold estimated in Trott et al. (Reference Trott2020) were excluded, resulting in 4 943 observations (equivalent to 165 h), as depicted by the green area in Fig. 3. The ionospheric distributions are illustrated in Fig. 7, with mean values ranging from 3.8 to 4.5 arcmin across pointings. This metric serves as a proxy for ionospheric activity in an observation, with lower values indicating less ionospheric activity. The Kolmogorov-Smirnov test indicates that these distributions are not normally distributed.

Figure 7. The ionospheric distribution after applying the threshold cut-off of 5. The colors represent the different pointings.

6.2. Flagging occupancy

The flagging occupancy was calculated based on flags generated from data quality issues outlined in Section 6.1, as well as flags obtained through the application of AOFlagger and autocorrelation analysis on the observations (refer to Section 4.1). The left panel of Fig. 8 illustrates the flagging occupancy for a set of MWA observations. Observations with a flagging occupancy greater than 25 $\%$ were discarded.

Figure 8. Left: The total occupancy obtained from the pre-flags and AOFlagger, described in Section 4.1. The flags are displayed only for a portion of the observations to avoid crowding and across the Local Sidereal Time (LST). The negative LSTs should be read as the corresponding LST subtracted from 24 h. Right: Same as the left panel with the total occupancy evaluated from SSINS flagger (Wilensky et al. Reference Wilensky, Morales, Hazelton, Barry, Byrne and Roy2019).

However, studies by Wilensky et al. (Reference Wilensky, Morales, Hazelton, Barry, Byrne and Roy2019) and Barry et al. (Reference Barry2019) revealed that AOFlagger may overlook faint systematics, particularly faint RFI residing below thermal noise. Hence, we further evaluated the flagging occupancy on the flagged visibilities to assess the quality of our data. Notably, flags produced by SSINS were used solely for the filtering process.

The SSINS flagger provides occupancies for four classes of RFI: faint broadband streak, narrow broadband interference, DTV Signal, and total occupancy. Faint broadband refers to systematics of unknown origins occupying a wide band, while narrow interference relates to systematics at a specific frequency or a narrow band. DTV signal represents the full propagation of the DTV interference across all baselines, partly identified by AOFlagger, and total occupancy evaluates the underlying faint systematics identified by the algorithm over the entire observation, as illustrated in the right panel of Fig. 8.

Observations exhibiting any broadband, narrow interferences, or DTV signal were discarded. The averaged flagging occupancy for each night was evaluated for both flaggers, shown in Fig. 9. It serves as another comparison between the occupancies flagged by AOFlagger and SSINS, highlighting faint systematics overlooked by AOFlagger, but identified by SSINS. These faint systematics may originate from faint RFI, sources at the horizon attenuated enough by the primary beam to evade detection by AOFlagger, or other unknown RFI origins. Observations with SSINS occupancy exceeding 25 $\%$ were rejected, resulting in only one quarter of observations remaining (2 161; 72 h). This outcome aligns with the findings of Wilensky et al. (Reference Wilensky, Morales, Hazelton, Barry, Byrne and Roy2019), who reported that one third of the data used for the power spectrum in Beardsley et al. (Reference Beardsley2016) was contaminated by DTV RFI. It is also noteworthy that the difference in occupancies may partly be attributed to the way both flaggers operate: AOFlagger identifies RFI on an antenna basis, while SSINS performs its analysis on a per-baseline mode.

6.3. Calibration solutions

The quality of each observation was further assessed concerning the calibration solutions derived in Section 4.2. The least-squares minimisation algorithm yielded convergence values, indicating the degree to which the solutions approached zero. We employed the variance of these convergence values across frequencies as a deviation metric. Observations with deviations surpassing the square root of the specified stopping threshold were flagged as outliers and treated accordingly.

6.4. Delay-transformed power spectrum

The delay transformed power spectrum estimator (Beardsley et al. Reference Beardsley2016) were used to derive:

  • Power Spectra Wedge Power $P_{\mathrm{wed}}$ : The power in Jy $^2$ confined within the area underneath the boundary set by the chromaticity of the instrument for baselines $\lt 100\lambda$ and averaged by the number of contributing cells.

  • Power Spectra Window Power $P_{\mathrm{win}}$ : The power in Jy $^2$ in the EoR window up till the first coarse channel, corresponding to $k_{||} \lt 0.4~h$ $\mathrm{Mpc}^{-1}$ . Baselines $\lt 100 \lambda$ were included in the averaging and the power was normalised by the number of contributing cells (Beardsley et al. Reference Beardsley2016).

We then constructed four data quality metrics with the above quantities to assess our observations and these are:

  1. 1. $P_{\mathrm{win}} \, (\mathrm{unsub})$ : Window power $P_{\mathrm{win}}$ evaluated from the delay-transformed visibilities before foreground subtraction.

  2. 2. $\frac{P_{\mathrm{win}}}{P_{\mathrm{wed}}} \,(\mathrm{unsub})$ : Ratio of window power to wedge evaluated from the delay-transformed visibilities before foreground subtraction.

  3. 3. $P_{\mathrm{win}} (\frac{\mathrm{sub}}{\mathrm{unsub}})$ : Ratio of window power evaluated from the delay-transformed visibilities after foreground subtraction to before subtraction.

  4. 4. $P_{\mathrm{wed}}(\frac{\mathrm{sub}}{\mathrm{unsub}})$ : Ratio of wedge power evaluated from the delay-transformed visibilities after foreground subtraction.

Figure 9. Flagging occupancy evaluated by AOFlagger (solid) and SSINS (dotted) averaged over observation for individual nights.

Given the widely spread distribution of the above mentioned quantities, illustrated in Fig. 10, particularly the distribution of pointing $-3$ , we adopted the interquartile range as it is known for being resilient to extreme values. The derived cut-off thresholds, and the corresponding number of successful observations are presented in Table 2. Observations that portray sufficient leakage into the window, with a maximum power value of 17 Jy $^2$ were thrown. If the ratio of the maximum power in the window to the wedge were greater than $5.7\%$ , the observation were treated as an anomaly. The power removed by our subtraction methodology were quantified using the ratio of maximum power measured after foreground subtraction in both wedge and window regions to the power measured before subtraction. If the ratio, resulted in values greater than 21% and 73% for the wedge and window respectively, the observation was excluded. As a result, all observations from pointing $-3$ were filtered out.

Figure 10. Distribution of metrics from images and delay-transformed visibilities across observations. Colors represent different pointing. The red dashed lines mark the cut-off thresholds used for filtering. The insets in the first two panels are zoomed versions of the distributions for pointing −3, that are excluded from our thresholding.

6.5. Images

The images generated from the imaging process described in Section 4.4 were used to determine the following:

  • RMS of Stokes V image: The root mean square over a small area (100 by 100 pixels area) sitting at one corner of the image, away from the concentration of source emissions.

  • $PKS0026-23$ flux density S: $PKS0026-23$ is the brightest source in the field of view with a flux density of 17.47 Jy at 150 MHz as reported in the GLEAM survey (Wayth et al. Reference Wayth2015; Hurley-Walker et al. Reference Hurley-Walker2017). Here, a naive extraction of the flux density was done. It was evaluated as the integration over the area centred at the source within a radius spanning two synthesised beams. The radius was chosen to account for any remaining phase offset. The extraction was done separately on both polarisations.

Using the aforementioned quantities, five data quality assessment metrics were constructed:

  1. 1. $V_{\mathrm{rms}} (\mathrm{unsub})$ : RMS across a selected pixel box in Stokes V image.

  2. 2. $\frac{S_{V}}{(S_{\mathrm{EW}} + S_{\mathrm{NS}})}$ : Ratio of $PKS0023-026$ flux density S extracted from Stokes V to sum of flux density extracted from EW and NS polarisations.

  3. 3. $(S_{\mathrm{EW}}, S_{\mathrm{NS}})$ : Difference between $PKS0023-026$ flux densities across EW and NS polarisations.

  4. 4. $S_{\mathrm{EW}}(\frac{\mathrm{sub}}{\mathrm{unsub}})$ : Ratio of $PKS0023-026$ flux density from subtracted image to unsubtracted image along EW polarisation.

  5. 5. $S_{NS}(\frac{\mathrm{sub}}{\mathrm{unsub}})$ : Ratio of $PKS0023-026$ flux density from subtracted image to unsubtracted image along NS polarisation.

The cutoff thresholds for the derived quantities were evaluated using the $3\sigma$ -rule. The results are provided in Table 2. Since the Murchison Widefield Array (MWA) consists of linearly polarised dipoles, we anticipate minimal circularly polarised visibilities, as there are no bright Stokes V sources within our primary beam. The distribution of the pixels in Stokes V should, in principle, be noise-like; therefore, the mean is expected to be around zero, with no skewness. Observations not adhering to this criterion and bearing an RMS value greater than $9.5$ mJy are filtered out. The top panel of Fig. 11 presents the RMS values in Stokes V for observations that passed this criterion as a function of local sidereal time. We observed that this filter excludes all observations from pointing $-3$ . It is evident that the East-West negative pointings exhibit higher RMS values than the positive ones. Pointings $-2$ and $-1$ have the Galactic Centre in the second sidelobe of the primary lobe, and the emission could be leaking into Stokes V. The increase in RMS towards the positive pointings may be attributed to the influence of Fornax A, as it moves into the first sidelobe of the primary beam.

Figure 11. Top: RMS values of Stokes V in Jy beam $^{-1}$ of the chosen observations plotted as a function of LST (adopting same LST convention as Fig. 8). The colours indicate pointing. Bottom Ratio of estimated $PKS0026-23$ flux density from Stokes V to sum of EW and NS images as a function of LST.

The flux density ratio of the source $PKS0023-026$ in Stokes V to the sum of EW and NS polarisations (equivalent to pseudo-Stokes I) was calculated. This quantity informs us about the percentage of instrumental leakage from $(EW+NS) \rightarrow V$ that could also occur in the reverse direction, $V \rightarrow (EW+NS)$ . Observations with an estimated leakage greater than $0.3\%$ were discarded. Fractional ratios for the successful observations are shown in the bottom panel of Fig. 11, following a similar trend as the RMS, except for pointings 2 and 3. The flux densities of $PKS0026-23$ from Stokes V images conform to the RMS trend, except for pointings 2 and 3. This behavior might be attributed to the position of $PKS0026-23$ , such that Fornax A has a negligible contribution. The ratio of the flux density after to before foreground removal was also scanned, such that observations with values below 13% were allowed to proceed.

The aforementioned metrics aimed to mitigate observations dominated by RFI, ionosphere, or contamination from nearby bright emissions. They also identified observations for which calibration and foreground subtraction did not perform well. This under-performance may be attributed to unidentified or unclassified systematics.

7. Results

The filtering process discussed in Section 6 yielded a set of 1 734 observations (58 h) from six pointings. Before delving into constructing the power spectrum from the observations, we compared our pipeline with the traditional MWA Real Time System pipeline (RTS, Mitchell et al. Reference Mitchell, Greenhill, Wayth, Ord, Sault, Doeleman, Kasper and Morales2007) used in Trott et al. (Reference Trott2020).

7.1. Pipeline vs RTS

The RTS pipeline constitutes the following: conversion of raw MWA files to uvfits using cotter; (2) use of AOFlagger for flagging; (3) DI calibration using a sky model ; direction dependent calibration on the five brightest sources, and (4) catalogue subtraction for foreground removal. As observed, there are quite a few differences between the two pipelines. Our pipeline uses the latest LoBES catalogue as the sky model while RTS used the catalogue created from GLEAM along with cross-matched sources from TGSS GRMT described in (Procopio et al. Reference Procopio2017). To reduce comparative complexities, we used the same sky model and propagated the same flags to the observation, hence the comparison here is mainly between the calibration implementation.

We computed the power spectrum for a single observation from both pipelines. Fig. 12 presents the spherically averaged power spectra resulting from both pipelines. The k bins that went into the averaging are $k_{\perp} \lt 0.04 h\,\mathrm{Mpc}^{-1}$ to exclude low contaminated modes and $k_{||} \gt 0.15\, h\,\mathrm{Mpc}^{-1}$ to exclude any possible left over leakage. The spectra was generated across the whole frequency band (167–197 MHz). Both strategies perform similarly, with hyperdrive performing slightly better at some k modes. This result is useful to us as it informs us that any improvements to the power spectrum would be primarily attributed to the data quality assurance strategy.

Figure 12. 0ne-dimensional power spectra constructed by averaging the power over the ( $k_{\perp}, k_{||}$ ) boundaries described in Section 7.1 with top panel present results from EW polarisation and bottom panel from NS. The dotted lines is the thermal noise evaluated from the weights assigned to each of the visibility points.

7.2. Improvements to the power spectrum

We now compare the power spectrum formed before implementing our data quality assurance strategy described in Section 2 to our current pipeline. The first set of observations includes 300 observations randomly picked from the 9 655 datasets we started off with while the second set includes 300 observations chosen from the filtered set. All pointings are included. The comparison of the one-dimensional power spectra averaged over the k bins stated in Section 7.1 is shown in Fig. 13. Our filtering strategy does show an improvement in the power level particularly at low k modes indicating that excluding unreliable observations helps in preventing power to leak beyond the wedge region. This behaviour is observed along both polarisations.

Figure 13. 0ne-dimensional power spectra constructed using 300 observations. The spectra in cyan was from observations selected from the unfiltered set of 9 655 while the one in magenta was form from the filtered set of 1 734 observations. The power is averaged over the ( $k_{\perp}, k_{||}$ ) boundaries described in Section 7.1 with top panel present results from EW polarisation and bottom panel from NS. The dotted lines is the thermal noise evaluated from the weights assigned to each of the visibility points. The insets at the bottom shows the non-log plot of the thermal noise where the slight difference is visible.

7.3. Power spectra for each pointing

We also compared the power spectra generated under the same conditions as mentioned in Section 7.1 for different pointings. After the filtering process discussed in Section 6, we were left with observations for only six pointings (3, 2, 1, 0, –1, –2). Each of these pointings carry different number of observations, therefore to avoid noise bias in our results, we used the same number of observations, amounting to about 4.3 h.

The two-dimensional power spectra for EW polarisation constructed across the full band for the six pointings are displayed in Fig. 14. The black dotted line marks the horizon limit marking the baseline-dependent boundaries. The harmonics are due to the flags applied on the edges and centre frequency channels. The foreground subtraction performed a decent job in reducing the overall foreground power. The bright foreground emission constrained at low $k-$ modes are mostly from the Galactic emission that were not included in the subtraction model.

Figure 14. Two-dimensional power spectra produced from 130 observations (4.3 h) across full frequency band for the different pointing. The power at large angular scales decrease as we shift from negative to positive pointings, showing the effect of the contribution from the Galactic plane. The dashed black lines represents the horizon limit. The dashed blue box indicates the area enclosed by $0.01 h,\mathrm{Mpc}^{-1}$ $\lt k_{\perp} \lt0.04 h\, \mathrm{Mpc}^{-1}$ and Mpc $^{-1} 0.15 h\,\lt k_{||} \lt 3.4 h\, \mathrm{Mpc}^{-1}$ .

The EoR window we are interested in lies above the black dotted lines with a buffer of $0.05 h\, \mathrm{Mpc}^{-1}$ , bounded by $0.01 h\,\mathrm{Mpc}^{-1}$ $\lt k_{\perp} \lt0.04 h\, \mathrm{Mpc}^{-1}$ and Mpc $^{-1} 0.15 h\,\lt k_{||} \lt 3.4 h\, \mathrm{Mpc}^{-1}$ , region enclosed by the blue dashed lines in Fig. 14. We chose these k modes to escape potential foreground spilling into the window. It is hard to provide a quantitative difference between the pointings as they all are behaving at different k modes. The most obvious pattern, analogous to top panel of Fig. 11, are the modes corresponding to $0.01 h,\mathrm{Mpc}^{-1}$ $\lt k_{\perp} \lt 0.015 h\, \mathrm{Mpc}^{-1}$ . The cleanest window within the stated boundary is produced by pointings 0, 1, and 2. Power leakage beyond the horizon limit is more prominent in pointings –2 and –1. Even though the Galactic plane in these pointing is past the second sidelobe of the primary beam response of MWA, it still impacts the foregrounds via aliasing as it sets over the horizon (Barry et al. Reference Barry, Line, Lynch, Kriele and Cook2023).

We also generated power spectra at $z=6.5$ using visibilities across (182, 197) MHz band. The results are similar to the full band in Fig. 14. The dimensionless power spectra $\Delta^2$ , averaged over $ 0.01 \lt k_{\perp} \lt 0.04 \,h$ $\mathrm{Mpc}^{-1}$ and $k_{||} \gt 0.15\, h$ $\mathrm{Mpc}^{-1}$ , is plotted in Fig. 15. The positive pointings have a lower floor compared to the negative ones at large angular scales for both polarisations supporting our statement about the Galactic plane contribution across these modes. The inset shows a clear representation of the power level for each pointing at low k modes. At high k modes the pointings seem to converge.

Figure 15. One-dimensional power spectra produced from 130 observations (4.3 h) over 15 MHZ band (182–197 MHz) for the different pointings, represented by solid coloured lines. The left panel shows results for EW polarisation and right panel for NS polarisation. The dotted lines show the measured thermal noise ( $2\sigma$ plus sample variance) computed across the observations. The inset shows a zoom version of the distribution of the power level at low k modes.

Although some of the pointings performs better than other at specific k modes, the distribution of the power spectra do not indicate any far-flung behaviours. Therefore, we proceeded with the power spectrum analysis using observations from all six pointings.

7.4. Validating our strategy

We analysed our systematic strategy on a set of observations across all pointings, chosen arbitrarily. Our aim was to construct and compare the power spectra at each of the filtering steps. However, we were limited by our pipeline settings and computational resources, preventing us from constructing the power spectra after data quality issues were reported in the database. Therefore, we began with a set of 300 observations that had passed data quality inspection.

Out of these 300 observations, 62 were found to have an ionospheric metric greater than 5. Discarding highly-ionospherically active observations produced a difference of about an order in power at most of the k-modes, as illustrated by the one-dimensional power spectra in Fig. 16. Applying the flagging occupancy left us with 130 observations, raising the power level higher. This rise is due to the reduction in the number of observations, resulting in a higher noise level, delineated by the thermal noise in dashed lines. The metrics from the delay-transformed visibilities rejected 30 observations, slightly improving the results. The image statistics did not spot any misbehaving observations for this set.

Figure 16. One-dimensional power spectra produced over 15 MHZ band (182–197 MHz) from observations as they are ruled out by the different steps namely: quality issues (300 observations), ionoQA (238 observations), flagging occupancy (130 observations) and delay spectra (100 observations). The dotted lines show the measured thermal noise ( $2\sigma$ plus sample variance) computed from the observations.

Since it is not straightforward to validate the power spectra due to the varying number of observations, we use the gap between the power and the thermal noise to infer any improvements. As mentioned in the previous paragraph, the power spectra evaluated after filtering out using the ionoQA show major improvements. Comparing power spectra produced after accounting for flagging occupancy and metrics evaluated from delay spectra also indicate an improvement when the contaminated observations indicated by the delay spectra metrics are removed.

At this point, it is challenging to discuss the specific improvements of flagging occupancy over ionospheric filtering using the same principle. However, the distance of the power from the corresponding thermal noise level does exhibit an improvement. After discarding observations ruled out by flagging occupancy, the power is closer to the thermal noise compared to before filtering these observations, indicating a cleaner dataset.

7.5. Combined power spectra

In the final step, we combined the filtered observations (58 h) described in Section 6 and formed power spectra from the individual polarisations for diagnosis purposes as this paper is intended to present the improved pipeline, discussing explicitly the systematic mitigation approach. The resulting one-dimensional power spectra at $z=6.5$ obtained by averaging over $0.01 \lt k_{\perp} \lt 0.04 \,h$ $\mathrm{Mpc}^{-1}$ and $k_{||} \gt 0.15\, h$ $\mathrm{Mpc}^{-1}$ , is shown in Fig. 17. The lowest measurements for EW and NS polarisations are $\Delta^2= (57.2 \mathrm{mK})^2$ and $\Delta^2= (74.6 \mathrm{mK})^2$ at $k=0.19\,h$ $\mathrm{Mpc}^{-1}$ , respectively.

Figure 17. One-dimensional power spectra produced from 1 796 observations (58 h) over 15 MHZ band (182–197 MHz) for both polarisations. The dotted lines show the measured thermal noise ( $2\sigma$ plus sample variance) computed from 1 734 observations. The grey and blue markers indicate the upper limits obtained in Trott et al. (Reference Trott2020) and Rahimi et al. (Reference Rahimi2021), respectively.

We overlaid upper limits from Trott et al. (Reference Trott2020) and Rahimi et al. (Reference Rahimi2021). However, these upper limits cannot be directly compared with our measurements due to differences in calculation conditions. Rahimi et al. (Reference Rahimi2021) utilised only phase I data and focused on a different observing field. On the other hand, Trott et al. (Reference Trott2020) used observations from the same field but included both phase I and phase II data, employing a different sky model. Incorporating the sky model from Trott et al. (Reference Trott2020) yielded marginal differences. To enhance compatibility, averaging was performed on the same k modes as in Trott et al. (Reference Trott2020).

A back-of-the-envelop comparison of the sensitivity levels between a single phase I and phase II observation yielded an improvement of about 0.6. Applying this factor to our current measurements does indicate an improvement to our current results which is promising.

8. Conclusion and future work

In this paper, we presented a statistical framework whereby metrics were derived at intermediate stages to prevent systematic errors from propagating to the power spectrum. These metrics were used to assess the quality of an observation, informing the pipeline whether it should proceed to the next stage. We found that without these metrics, many systematics, such as bad frequency channels, malfunctioning antennas, and corrupted observations, would not have been identified. When compared to observations from the EoR0 field used in the estimation by Trott et al. (Reference Trott2020), about one third of the observations in common were flagged by our methodology.

Our strategy filtered out 82% of our initial observations, leaving approximately 58 h of data (half the number used in Trott et al. Reference Trott2020). We achieved a comparable lowest floor of $\Delta^2= (57.2 \mathrm{mK})^2$ at $k=0.19,h$ $\mathrm{Mpc}^{-1}$ at $z=6.5$ along the EW polarisation. These results were obtained from observations less dominated by systematic errors, as determined by our statistical framework, increasing the reliability and confidence in our power spectrum results.

We also evaluated the accuracy and reliability of the calibration software Hyperdrive (Jordan Reference Jordan2023). Furthermore, this work explores the latest sky model LoBES, designed specifically for EoR experiments targeting the EoR0 field. Overall, our current processing pipeline implemented in NextFlow (Di Tommaso et al. Reference Di Tommaso, Chatzou, Floden, Prieto Barja, Palumbo and Notredame2023) is efficient, with most integrated software components working harmoniously with minimal human intervention, thereby reducing errors.

The methodology and results presented in this paper can be improved upon and we have identified few avenues:

  1. 1. Including observations from the Phase II compact configuration would help in obtaining lower power levels as demonstrated by our rough estimations.

  2. 2. Increasing the number of iterations for direction-independent calibration. This would enhance convergence of the algorithm producing more accurate complex antenna gains.

  3. 3. Improving on the current calibration algorithm using gain solutions from the autocorrelations (Li et al. Reference Li2019).

  4. 4. Strengthening our current data quality framework, by automating the anomaly detection from the phases of the complex antenna gains and incorporating machine learning.

  5. 5. Clustering observations sharing similar features using the derived statistical metrics to identify the optimal cluster of observations for power spectrum estimation.

Some of the aforementioned strategies are already in development and will be featured in Nunhokee (Reference Nunhokee2024).

Acknowledgements

This research was supported by the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE170100013. This scientific work uses data obtained from Inyarrimanha Ilgari Bundara/the Murchison Radio-astronomy Observatory. We acknowledge the Wajarri Yamaji People as the Traditional Owners and native title holders of the Observatory site. Establishment of CSIRO’s Murchison Radio-astronomy Observatory is an initiative of the Australian Government, with support from the Government of Western Australia and the Science and Industry Endowment Fund. Support for the operation of the MWA is provided by the Australian Government (NCRIS), under a contract to Curtin University administered by Astronomy Australia Limited. This work was supported by resources provided by the Pawsey Supercomputing Research Centre with funding from the Australian Government and the Government of Western Australia. The International Centre for Radio Astronomy Research (ICRAR) is a Joint Venture of Curtin University and The University of Western Australia, funded by the Western Australian State government. Some of these data and the pipeline development was undertaken with industry partners, Downunder Geosolutions, and we acknowledge their role in the progress of this project. We thank Michael Wilensky for his useful suggestions on the paper draft. N.B. acknowledges the support of the Forrest Research Foundation, under a postdoctoral research fellowship.

Data availability statement

The data used in this paper is publicly available on MWA Archival System.

References

Abdurashidova, Z., et al. 2022, ApJ, 924, 51. https://doi.org/10.3847/1538-4357/ac2ffc. arXiv: 2108.07282 [astro-ph.CO].CrossRefGoogle Scholar
Barry, N., Hazelton, B., Sullivan, I., Morales, M. F., & Pober, J. C. 2016, MNRAS, 461, 3135. https://doi.org/10.1093/mnras/stw1380. arXiv: 1603.00607 [astro-ph.IM].CrossRefGoogle Scholar
Barry, N., Line, J. L. B., Lynch, C. R., Kriele, M., & Cook, J. 2023, https://doi.org/10.48550/arXiv.2312.07506 CrossRefGoogle Scholar
Barry, N., et al. 2019, ApJ, 884, 1. https://doi.org/10.3847/1538-4357/ab40a8. arXiv: 1909.00561 [astro-ph.IM].CrossRefGoogle Scholar
Beardsley, A. P., et al. 2016, ApJ, 833, 102. https://doi.org/10.3847/1538-4357/833/1/102. arXiv: 1608.06281 [astro-ph.IM].CrossRefGoogle Scholar
Berkhout, L. M., et al. 2024, https://doi.org/10.48550/arXiv.2401.04304 CrossRefGoogle Scholar
Bernardi, G., et al. 2013, ApJ, 771, 105. https://doi.org/10.1088/0004-637X/771/2/105. arXiv: 1305.6047 [astro-ph.CO].CrossRefGoogle Scholar
Bernardi, G., Mitchell, D. A., Ord, S. M., Greenhill, L. J., Pindor, B., Wayth, R. B., & Wyithe, J. S. B. 2011, MNRAS, 413, 411. https://doi.org/10.1111/j.1365-2966.2010.18145.x. arXiv: 1012.3719 [astro-ph.CO].CrossRefGoogle Scholar
Briggs, D. S. 1995, American Astronomical Society Meeting Abstracts, 187, 112.02. American Astronomical Society Meeting Abstracts.Google Scholar
Charles, N., et al. 2022, MNRAS, 512, 2716. https://doi.org/10.1093/mnras/stac709. arXiv: 2112.12765 [astro-ph.CO].CrossRefGoogle Scholar
DeBoer, D. R., et al. 2017, PASP, 129, 045001. https://doi.org/10.1088/1538-3873/129/974/045001. arXiv: 1606.07473 [astro-ph.IM].CrossRefGoogle Scholar
Di Tommaso, P., Chatzou, M., Floden, E., Prieto Barja, P., Palumbo, E., & Notredame, C. 2023, Astrophysics Source Code Library, record ascl:2305.024, May.Google Scholar
Ewall-Wice, A., Dillon, J. S., Liu, A., & Hewitt, J. 2017, MNRAS, 470, 1849. https://doi.org/10.1093/mnras/stx1221. arXiv: 1610.02689 [astro-ph.CO].CrossRefGoogle Scholar
Furlanetto, S. R. 2016, in Understanding the Epoch of Cosmic Reionization: Challenges and Progress, ed. Mesinger, A., 423, 247. Astrophysics and Space Science Library. January. https://doi.org/10.1007/978-3-319-21957-8_9. arXiv: 1511.01131 [astro-ph.CO].CrossRefGoogle Scholar
Hamaker, J. P., Bregman, J. D., & Sault, R. J. 1996, A&ASS, 117, 137.Google Scholar
Collaboration, HERA, et al. 2023, ApJ, 945, 124. https://doi.org/10.3847/1538-4357/acaf50.CrossRefGoogle Scholar
Hurley-Walker, N., et al. 2017, MNRAS, 464, 1146. https://doi.org/10.1093/mnras/stw2337. arXiv: 1610.08318 [astro-ph.GA].CrossRefGoogle Scholar
Jordan, C. 2023, Next generation calibration software for the Murchison Widefield Array radio telescope.Google Scholar
Jordan, C. H., et al. 2017, MNRAS, 471, 3974. https://doi.org/10.1093/mnras/stx1797. arXiv: 1707.04978 [astro-ph.IM].CrossRefGoogle Scholar
Josaitis, A. T., Ewall-Wice, A., Fagnoni, N., & de Lera Acedo, E. 2022, MNRAS, 514, 1804. https://doi.org/10.1093/mnras/stac916. arXiv: 2110.10879 [astro-ph.IM].CrossRefGoogle Scholar
Kern, N. S., et al. 2020, ApJ, 888, 70. https://doi.org/10.3847/1538-4357/ab5e8a. arXiv: 1909.11733 [astro-ph.IM].CrossRefGoogle Scholar
Kolopanis, M., Pober, J. C., Jacobs, D. C., & McGraw, S. 2023, MNRAS, 521, 5120. https://doi.org/10.1093/mnras/stad845. arXiv: 2210.10885 [astro-ph.CO].CrossRefGoogle Scholar
Li, W., et al. 2019, ApJ, 887, 141. https://doi.org/10.3847/1538-4357/ab55e4. arXiv: 1911.10216 [astro-ph.CO].CrossRefGoogle Scholar
Line, J. L. B., Trott, C. M., Cook, J. H., Greig, B., Barry, N., & Jordan, C. H. 2024, https://doi.org/10.48550/arXiv.2404.05140 CrossRefGoogle Scholar
Liu, A., Parsons, A. R., & Trott, C. M. 2014a, PhRvD, 90, 023018. https://doi.org/10.1103/PhysRevD.90.023018. arXiv: 1404.2596 [astro-ph.CO].CrossRefGoogle Scholar
Liu, A., Parsons, A. R., & Trott, C. M. 2014b, PhRvD, 90, 023019. https://doi.org/10.1103/PhysRevD.90.023019. arXiv: 1404.4372 [astro-ph.CO].CrossRefGoogle Scholar
Lynch, C. R., et al. 2021, PASA, 38, e057. https://doi.org/10.1017/pasa.2021.50. arXiv: 2110.08400 [astro-ph.CO].CrossRefGoogle Scholar
Mitchell, D. A., Greenhill, L. J., Wayth, R. B., Sault, R. J., Lonsdale, C. J., Cappallo, R. J., Morales, M. F., & Ord, S. M. 2008, IEEE JSTSP, 2, 707. https://doi.org/10.1109/JSTSP.2008.2005327.CrossRefGoogle Scholar
Mitchell, D., Greenhill, L., Wayth, R., Ord, S., Sault, R., Doeleman, S., Kasper, J., & Morales, M. 2007, American Astronomical Society Meeting Abstracts, 211, 11.03. American Astronomical Society Meeting Abstracts. December.Google Scholar
Moore, D. F., et al. 2017, ApJ, 836, 154. https://doi.org/10.3847/1538-4357/836/2/154. arXiv: 1502.05072 [astro-ph.CO].CrossRefGoogle Scholar
Morales, M. F., Beardsley, A., Pober, J., Barry, N., Hazelton, B., Jacobs, D., & Sullivan, I. 2019, MNRAS, 483, 2207. https://doi.org/10.1093/mnras/sty2844. arXiv: 1810.08731[astro-ph.CO].CrossRefGoogle Scholar
Morales, M. F., Bowman, J. D., & Hewitt, J. N. 2006, ApJ, 648, 767. https://doi.org/10.1086/506135. arXiv: astro-ph/0510027 [astro-ph].CrossRefGoogle Scholar
Morales, M. F., & Hewitt, J. 2004, ApJ, 615, 7. https://doi.org/10.1086/424437. arXiv: astro-ph/0312437 [astro-ph].CrossRefGoogle Scholar
Murphy, G. G., et al. 2023, https://doi.org/10.48550/arXiv.2312.03697 CrossRefGoogle Scholar
Neben, A. R., et al. 2016, ApJ, 820, 44. https://doi.org/10.3847/0004-637X/820/1/44. arXiv: 1602.05249 [astro-ph.IM].CrossRefGoogle Scholar
Nunhokee, C. D. 2024, Clustering of observations for EoR experiments using machine learning.Google Scholar
Nunhokee, C. D., et al. 2017, ApJ, 848, 47. https://doi.org/10.3847/1538-4357/aa8b73. arXiv: 1707.04109 [astro-ph.CO].CrossRefGoogle Scholar
Nunhokee, C. D., et al. 2020, ApJ, 897, 5. https://doi.org/10.3847/1538-4357/ab9634. arXiv: 2005.12174 [astro-ph.IM].CrossRefGoogle Scholar
Offringa, A. R. 2010, Astrophysics Source Code Library, record ascl:1010.017, October. ascl: 1010.017.Google Scholar
Offringa, A. R., et al. 2014, MNRAS, 444, 606. https://doi.org/10.1093/mnras/stu1368.arXiv:1407.1943 [astro-ph.IM].CrossRefGoogle Scholar
Offringa, A. R., van de Gronde, J. J., & Roerdink, J. B. T. M. 2012, A&A, 539, A95. https://doi.org/10.1051/0004-6361/201118497. arXiv: 1201.3364 [astro-ph.IM].CrossRefGoogle Scholar
Offringa, A. R., et al. 2015, PASA, 32, e008. https://doi.org/10.1017/pasa.2015.7. arXiv: 1501.03946 [astro-ph.IM].CrossRefGoogle Scholar
Ord, S. M., et al. 2010, PASP, 122, 1353. https://doi.org/10.1086/657160. arXiv: 1010.1733 [astro-ph.IM].CrossRefGoogle Scholar
Paciga, G., et al. 2011, MNRAS, 413, 1174. https://doi.org/10.1111/j.1365-2966.2011.18208.x. arXiv: 1006.1351 [astro-ph.CO].CrossRefGoogle Scholar
Patil, A. H., et al. 2016. MNRAS, 463, 4317. https://doi.org/10.1093/mnras/stw2277. arXiv: 1605.07619 [astro-ph.IM].CrossRefGoogle Scholar
Procopio, P., et al. 2017, PASA, 34, e033. https://doi.org/10.1017/pasa.2017.26. arXiv: 1707.02288 [astro-ph.IM].CrossRefGoogle Scholar
Rahimi, M., et al. 2021, MNRAS, 508, 5954. https://doi.org/10.1093/mnras/stab2918. arXiv: 2110.03190 [astro-ph.CO].CrossRefGoogle Scholar
Sokolowski, M., et al. 2017, PASA, 34, e062. https://doi.org/10.1017/pasa.2017.54. arXiv: 1710.07478 [astro-ph.IM].CrossRefGoogle Scholar
Sullivan, I. S., et al. 2012, ApJ, 759, 17. https://doi.org/10.1088/0004-637X/759/1/17. arXiv: 1209.1653 [astro-ph.IM].CrossRefGoogle Scholar
Thompson, A. R., Moran, J. M., & Swenson, G. W. Jr. 2017, Interferometry and Synthesis in Radio Astronomy (3rd edn.) https://doi.org/10.1007/978-3-319-44431-4.Google Scholar
Tingay, S. J., et al. 2013, PASA, 30, e007. https://doi.org/10.1017/pasa.2012.007. arXiv: 1206.6945 [astro-ph.IM].CrossRefGoogle Scholar
Trott, C. M., et al. 2016, ApJ, 818, 139. https://doi.org/10.3847/0004-637X/818/2/139. arXiv: 1601.02073 [astro-ph.IM].CrossRefGoogle Scholar
Trott, C. M., et al. 2020, MNRAS, 493, 4711. https://doi.org/10.1093/mnras/staa414. arXiv: 2002.02575 [astro-ph.CO].CrossRefGoogle Scholar
van Haarlem, M. P., et al. 2013, A&A, 556, A2. https://doi.org/10.1051/0004-6361/201220873. arXiv: 1305.3550 [astro-ph.IM].CrossRefGoogle Scholar
Vedantham, H., Udaya Shankar, N., & Subrahmanyan, R. 2012, ApJ, 745, 176. https://doi.org/10.1088/0004-637X/745/2/176. arXiv: 1106.1297 [astro-ph.IM].CrossRefGoogle Scholar
Wayth, R. B., et al. 2015, PASA, 32, e025. https://doi.org/10.1017/pasa.2015.26. arXiv: 1505.06041 [astro-ph.IM].CrossRefGoogle Scholar
Wayth, R. B., et al. 2018, PASA, 35, e033. https://doi.org/10.1017/pasa.2018.37. arXiv: 1809.06466 [astro-ph.IM].CrossRefGoogle Scholar
Wilensky, M. J., Morales, M. F., Hazelton, B. J., Barry, N., Byrne, R., & Roy, S. 2019, PASP, 131, 114507. https://doi.org/10.1088/1538-3873/ab3cad. arXiv: 1906.01093 [astro-ph.IM].CrossRefGoogle Scholar
Wilensky, M. J., et al. 2023, ApJ, 957, 78. https://doi.org/10.3847/1538-4357/acffbd. arXiv: 2310.03851 [astro-ph.CO].CrossRefGoogle Scholar
Figure 0

Figure 1. Data processing pipeline from extracting data to constructing a power spectra. The intermediate processes where inspection and filtering of the data products are carried out, are indicated in red. Each of the stages portrayed in black boxes are explained throughout the paper.

Figure 1

Figure 2. The telescope configuration during Phase I with 128 tiles pseudorandomly placed within a circumference of about 1.5$\sim$km.

Figure 2

Table 1. The altitudes and azimuths of the seven telescope beam pointings used in this analysis. Pointing 0 is the zenith, -3 is three pointings before zenith, and 3 is three pointings after zenith.

Figure 3

Figure 3. The bars in blue show the hours of observations extracted from MWA archival database for the individual pointing. The bars in green show the data that were selected based on the constraints described in Section 6.1. Number of nights associated with the pointings are delineated on the bars.

Figure 4

Figure 4. Top: Amplitudes of the gain solutions obtained for an observation with the colours representing individual antennas. No misbehaving antenna are identified. Bottom: Phases of the calibration solutions for the corresponding observation. The black points highlights the antenna with high fringing phases at low frequencies.

Figure 5

Figure 5. Left: Deconvolved images generated through the process described in Section 4.4 for a zenith pointing observation with (a) representing Stokes I formed from unsubtracted visibilities, (b) and (c) representing Stokes I and V from subtracted and ionospheric phase corrected visibilities, respectively, and (d) difference between source subtracted visibilities before and after phase offset correction. The units of the colorbar is Jy $\mathrm{beam}^{-1}$. The inset shows a zoom version of the region surrounding the brightest source in the field of view $PKS0026-23$.

Figure 6

Figure 6. RMS distribution across all observations before and after phase offset correction for EW (left) and NS (right) polarisations.

Figure 7

Table 2. List of metrics used as diagnosis for filtering observations.

Figure 8

Figure 7. The ionospheric distribution after applying the threshold cut-off of 5. The colors represent the different pointings.

Figure 9

Figure 8. Left: The total occupancy obtained from the pre-flags and AOFlagger, described in Section 4.1. The flags are displayed only for a portion of the observations to avoid crowding and across the Local Sidereal Time (LST). The negative LSTs should be read as the corresponding LST subtracted from 24 h. Right: Same as the left panel with the total occupancy evaluated from SSINS flagger (Wilensky et al. 2019).

Figure 10

Figure 9. Flagging occupancy evaluated by AOFlagger (solid) and SSINS (dotted) averaged over observation for individual nights.

Figure 11

Figure 10. Distribution of metrics from images and delay-transformed visibilities across observations. Colors represent different pointing. The red dashed lines mark the cut-off thresholds used for filtering. The insets in the first two panels are zoomed versions of the distributions for pointing −3, that are excluded from our thresholding.

Figure 12

Figure 11. Top: RMS values of Stokes V in Jy beam$^{-1}$ of the chosen observations plotted as a function of LST (adopting same LST convention as Fig. 8). The colours indicate pointing. Bottom Ratio of estimated $PKS0026-23$ flux density from Stokes V to sum of EW and NS images as a function of LST.

Figure 13

Figure 12. 0ne-dimensional power spectra constructed by averaging the power over the ($k_{\perp}, k_{||}$) boundaries described in Section 7.1 with top panel present results from EW polarisation and bottom panel from NS. The dotted lines is the thermal noise evaluated from the weights assigned to each of the visibility points.

Figure 14

Figure 13. 0ne-dimensional power spectra constructed using 300 observations. The spectra in cyan was from observations selected from the unfiltered set of 9 655 while the one in magenta was form from the filtered set of 1 734 observations. The power is averaged over the ($k_{\perp}, k_{||}$) boundaries described in Section 7.1 with top panel present results from EW polarisation and bottom panel from NS. The dotted lines is the thermal noise evaluated from the weights assigned to each of the visibility points. The insets at the bottom shows the non-log plot of the thermal noise where the slight difference is visible.

Figure 15

Figure 14. Two-dimensional power spectra produced from 130 observations (4.3 h) across full frequency band for the different pointing. The power at large angular scales decrease as we shift from negative to positive pointings, showing the effect of the contribution from the Galactic plane. The dashed black lines represents the horizon limit. The dashed blue box indicates the area enclosed by $0.01 h,\mathrm{Mpc}^{-1}$$\lt k_{\perp} \lt0.04 h\, \mathrm{Mpc}^{-1}$ and Mpc$^{-1} 0.15 h\,\lt k_{||} \lt 3.4 h\, \mathrm{Mpc}^{-1}$.

Figure 16

Figure 15. One-dimensional power spectra produced from 130 observations (4.3 h) over 15 MHZ band (182–197 MHz) for the different pointings, represented by solid coloured lines. The left panel shows results for EW polarisation and right panel for NS polarisation. The dotted lines show the measured thermal noise ($2\sigma$ plus sample variance) computed across the observations. The inset shows a zoom version of the distribution of the power level at low k modes.

Figure 17

Figure 16. One-dimensional power spectra produced over 15 MHZ band (182–197 MHz) from observations as they are ruled out by the different steps namely: quality issues (300 observations), ionoQA (238 observations), flagging occupancy (130 observations) and delay spectra (100 observations). The dotted lines show the measured thermal noise ($2\sigma$ plus sample variance) computed from the observations.

Figure 18

Figure 17. One-dimensional power spectra produced from 1 796 observations (58 h) over 15 MHZ band (182–197 MHz) for both polarisations. The dotted lines show the measured thermal noise ($2\sigma$ plus sample variance) computed from 1 734 observations. The grey and blue markers indicate the upper limits obtained in Trott et al. (2020) and Rahimi et al. (2021), respectively.