key: cord-0630173-gn6as2c3
authors: Hurley-Walker, Natasha; Galvin, Timothy J.; Duchesne, Stefan W.; Zhang, Xiang; Morgan, John; Hancock, Paul J.; An, Tao; Franzen, Thomas M. O.; Heald, George; Ross, Kathryn; Vernstrom, Tessa; Anderson, Gemma E.; Gaensler, Bryan M.; Johnston-Hollitt, Melanie; Kaplan, David L.; Riseley, Christopher J.; Tingay, Steven J.; Walker, Mia
title: GaLactic and Extragalactic All-sky Murchison Widefield Array survey eXtended (GLEAM-X) I: Survey Description and Initial Data Release
date: 2022-04-27
journal: nan
DOI: nan
sha: 198e96aa0dfe5bf31fe2daa2f2a250974480a8f7
doc_id: 630173
cord_uid: gn6as2c3

We describe a new low-frequency wideband radio survey of the southern sky. Observations covering 72 - 231 MHz and Declinations south of $+30^\circ$ have been performed with the Murchison Widefield Array "extended" Phase II configuration over 2018 - 2020 and will be processed to form data products including continuum and polarisation images and mosaics, multi-frequency catalogues, transient search data, and ionospheric measurements. From a pilot field described in this work, we publish an initial data release covering 1,447 sq. deg over 4h<RA<13h, -32.7deg<Dec<-20.7deg. We process twenty frequency bands sampling 72 - 231 MHz, with a resolution of $2'$ - $45''$, and produce a wideband source-finding image across 170 - 231 MHz with a root-mean-square noise of $1.27\pm0.15$ mJy/beam. Source-finding yields 78,967 components, of which 71,320 are fitted spectrally. The catalogue has a completeness of 98% at $\sim50$ mJy, and a reliability of 98.2% at $5\sigma$ rising to 99.7% at $7\sigma$. A catalogue is available from Vizier; images are made available on AAO Data Central, SkyView, and the PASA Datastore. This is the first in a series of data releases from the GLEAM-X survey.

Radio sky surveys offer a view of the high-energy sky, probing synchrotron, cyclotron, and thermal processes across a range of distances, from planets and exoplanets to high-redshift radio galaxies. At lower frequencies, the fields of view of radio telescopes are larger, enabling large-scale surveys of the radio sky, such as the National Radio Astronomy Observatory (NRAO) Very Large Array (VLA) Sky Survey (NVSS; Condon et al., 1998), the Sydney University Molonglo Sky Survey (SUMSS; Bock et al., 1999; Mauch et al., 2003), and the VLA Low-frequency Sky Survey Redux at 74 MHz (VLSSr; Lane et al., 2014). Spurred by the development of the Square Kilometre Array (SKA), new radio telescopes are exploring the radio sky across wider areas and frequency ranges than were accessible in the past (Fig. 1). The Murchison Widefield Array (MWA; Tingay et al., 2013), operational since 2013, is a precursor to the low-frequency component of the SKA, which will be the world's most powerful radio telescope. The GaLactic and Extragalactic All-sky MWA (GLEAM; Wayth et al., 2015) survey observed the whole sky south of declination (Dec) +30° from 2013 to 2015 between 72 and 231 MHz. GLEAM has been processed in a multitude of ways: continuum data releases cover most of the extragalactic sky (GLEAM ExGal; Hurley-Walker et al., 2017), the Magellanic Clouds, the Galactic Plane (GLEAM GP; Hurley-Walker et al., 2019b), and a deep region over the South Galactic Pole (GLEAM SGP; Franzen et al., 2021a); and polarisation products include all-sky circular (Lenc et al., 2018) and linear polarisation surveys (Polarised GLEAM Survey (POGS); Riseley et al., 2018, 2020). Cross-identifications have been provided for the 1,863 brightest radio sources in the mid-infrared (the G4Jy Sample; White et al., 2020a,b), and for 1,590 galaxies in the 6dF Galaxy Survey (Franzen et al., 2021b).

While GLEAM had lower sensitivity and resolution than other surveys of the time (e.g. the First Alternative Data Release of the Tata Institute for Fundamental Research Giant Metrewave Radio Telescope Sky Survey: TGSS-ADR1; Intema et al., 2017), its major advancement was in leveraging its low frequency and very large fractional bandwidth. Extremely steep spectral indices (α < −2, for S ∝ ν^α) indicate old emission, such as that found in the remnant stage of radio galaxy life cycles (Hurley-Walker et al., 2015; Duchesne & Johnston-Hollitt, 2019) or "fossil" emission in galaxy clusters (Giacintucci et al., 2020); rising spectral indices point toward thermal emission such as found in planetary nebulae (Hurley-Walker et al., 2019b). In this frequency range, absorption effects become important for many sources, allowing measurements to probe synchrotron and free-free absorption in extragalactic radio sources (Callingham et al., 2017) and in Galactic H II regions (Su et al., 2017). Additionally, GLEAM's very high sensitivity to large angular scales, often resolved out by interferometric surveys, enabled exploration of diffuse emission such as Galactic supernova remnants (e.g. Hurley-Walker et al., 2019c) and in clusters of galaxies (e.g. Zheng et al., 2018). In 2017 the MWA underwent an upgrade to "Phase II", in which an additional 128 tiles were added to the observatory. This enabled observing using two different 128-tile configurations: "compact", comprising many redundant baselines to improve calibration toward statistical detection of the Epoch of Reionisation (Joseph et al., 2018), and "extended", an array optimised for imaging (within the constraints of the observatory) with maximum baselines of 5.5 km, approximately doubling the resolution of the telescope.
The latter layout considerably reduces the sidelobes of the synthesised beam, allowing a more "natural" weighting of the visibility data, which thereby improves the sensitivity of the instrument; sidelobe confusion is also reduced. The smaller main lobe of the synthesised beam reduces the classical confusion limit from ∼ 2 mJy to ∼ 0.3 mJy at 200 MHz (Franzen et al., 2019) . These improvements make it more feasible to integrate for longer times and thereby reach lower noise levels without quickly approaching a confusion floor.

While GLEAM enabled a huge range of science outcomes, better modelling of the foregrounds for searches for the Epoch of Reionisation, and flux density scale calibration of the low-frequency southern sky, it is fundamentally limited by its low (∼ 2′) resolution and the sensitivity limits of the original configuration of the MWA. We therefore undertook a wide-area survey with the Phase II extended array to create GLEAM-X, a deeper, higher-resolution successor to GLEAM, with the same sky and frequency coverage, observed over 2018-2020. During that time, the Long Baseline Epoch of Reionisation Survey (LoBES; Lynch et al., 2021) demonstrated the survey capability of Phase II by measuring the spectral behaviour of 80,824 sources over 100-230 MHz in 3,069 deg², down to a noise limit of ∼ 2 mJy beam⁻¹, showing the utility of wide-area surveys with the extended array. New southern-sky radio surveys across 800-1400 MHz using the Australian SKA Pathfinder (ASKAP; Hotan et al., 2021), such as the Rapid ASKAP Continuum Survey (RACS; McConnell et al., 2020; Hale et al., 2021), have also been developed, offering improved morphological information for millions of radio sources. Fig. 1 shows that the sensitivity of GLEAM-X to ordinary radio galaxies (−0.8 < α < −0.5) is competitive with other ongoing wide-area surveys such as RACS and the Very Large Array Sky Survey at 3 GHz (VLASS; Lacy et al., 2020). Note also that its sensitivity to steep-spectrum sources (α = −2.5) is the same as that of the upcoming Evolutionary Map of the Universe (EMU; Norris et al., 2011, 2021), which will approach the confusion brightness limit at its frequency.
Covering the northern sky at 6″-60″ resolution, the LOw Frequency ARray (LOFAR; van Haarlem et al., 2013) is conducting several ongoing surveys: the LOFAR Two-metre Sky Survey (LoTSS; Shimwell et al., 2017), the LOFAR Low-Band Array Sky Survey (LoLSS; de Gasperin et al., 2021), and the LOFAR Decametre Sky Survey (LoDeSS; van Weeren et al. in prep).
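The cross-survey comparison in Fig. 1 rests on scaling flux densities between frequencies with the power law S ∝ ν^α. A minimal Python sketch of that scaling follows; the function names and the example source (10 mJy at 200 MHz, scaled to 1400 MHz) are illustrative assumptions and are not taken from the survey pipeline.

```python
import math


def scale_flux_density(s_ref, nu_ref_mhz, nu_mhz, alpha):
    """Scale a flux density from nu_ref_mhz to nu_mhz, assuming S ∝ ν^alpha."""
    return s_ref * (nu_mhz / nu_ref_mhz) ** alpha


def spectral_index(s1, nu1_mhz, s2, nu2_mhz):
    """Two-point spectral index alpha for S ∝ ν^alpha."""
    return math.log(s1 / s2) / math.log(nu1_mhz / nu2_mhz)


# Hypothetical example: a 10 mJy source at 200 MHz, scaled to 1400 MHz
# for a flat-ish, a typical, and a very steep spectral index.
for alpha in (-0.5, -0.8, -2.5):
    print(alpha, scale_flux_density(10.0, 200.0, 1400.0, alpha))
```

A steep-spectrum source fades far more quickly with increasing frequency than an ordinary radio galaxy, which is why a low-frequency survey can be comparatively sensitive to such sources.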

To reach noise levels that are a significant improvement over GLEAM while still covering a wide area, we accumulate a large (∼ 2 PB) volume of visibility data. Releasing processed data products in stages will be of more use to the community than a single data release in the future. This paper is therefore the first in a series of data releases. We release here a pilot survey area that indicates the qualities that can eventually be expected over the full survey, covering 1,447 deg² over 4 h ≤ RA ≤ 13 h, −32.7° ≤ Dec ≤ −20.7°. Polarisation processing and an associated early data release will be described in a companion paper, Zhang et al. (in prep). Herein we describe the GLEAM-X observations (Section 2), the processing pipeline to produce images and mosaics (Section 3), and source-finding to generate catalogues (Section 4), and motivate several extensions to the pipeline (Section 5). Section 6 concludes with an outlook on scientific advances enabled by the survey, and plans for further data releases.

All positions given in this paper are in J2000 equatorial coordinates.

GLEAM used a drift scan survey strategy to quickly and efficiently observe the entire sky south of Dec +30° using the Phase I "128T" configuration of the MWA (Wayth et al., 2015). In the first year (2013-08 to 2014-06) observations were made along the meridian (HA = 0 h), using seven pointings at Declinations centred on −72° to 18.6°. In the second year, further observations at HA = ±1 h were taken. By combining the GLEAM data in the image plane over the full range of HA for a region around the South Galactic Pole, Franzen et al. (2021a) were able to reach a noise level of 5 mJy beam⁻¹ at 215 MHz, about half that of the extragalactic data release by Hurley-Walker et al. (2017), showing that such a strategy was effective.

GLEAM-X therefore adopted a similar strategy, iterating through the same Declinations and HAs as GLEAM, but doubling the number of HA = 0 h observations, and using the extended configuration of the Phase II MWA.

Observations were performed in month-long blocks in order to observe similar ranges in RA across the different Declination and HAs, making it easier to combine many drift scans in large mosaics in simple sky projections, improving the uniformity of sensitivity across the sky.

To cover 72-231 MHz using the 30.72-MHz instantaneous bandwidth of the MWA, five frequency ranges, each 30.72 MHz wide, were cycled through sequentially, changing every two minutes. Gain calibrators were visited on an hourly basis in order to provide a back-up in case of unsuccessful in-field calibration (Section 3.1).

After the first observing run in the 2018-A observing semester¹, the data were triaged to search for poor ionospheric conditions that would hinder high-quality imaging. We determined calibration solutions for the gain calibrator observations on 30-s cadences, and examined the temporal variability between the first and last time-steps for each observation. Seventeen nights were identified as having unacceptably variable gains, with an average of more than 12° of phase change between the first and last time-steps of at least one calibrator, a level at which the imaging quality became very poor. These nights were re-observed in the 2019-A semester. In 2020, the COVID-19 pandemic reduced the observing time available in the 2020-A and B semesters in the extended configuration, so at the time of writing, no further observations to replace any other ionospherically disturbed nights have been possible, although further observations have been proposed for 2022. Table A1 summarises the observations taken over the period 2018-2020, including those nights that were re-observed.
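The triage criterion above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the "average" phase change is the mean absolute phase difference over whatever set of complex gains (for example per tile) is supplied for the first and last 30-s time-steps.

```python
import cmath
import math


def phase_change_deg(gain_first, gain_last):
    """Absolute phase difference in degrees between two complex gains.

    Multiplying by the conjugate wraps the difference into [-180, 180] degrees.
    """
    return abs(math.degrees(cmath.phase(gain_last * gain_first.conjugate())))


def is_unstable(first_gains, last_gains, threshold_deg=12.0):
    """Flag a calibrator scan whose mean phase change exceeds the threshold."""
    changes = [phase_change_deg(a, b) for a, b in zip(first_gains, last_gains)]
    return sum(changes) / len(changes) > threshold_deg
```

Working with the product of one gain and the conjugate of the other avoids spurious large differences when phases wrap across ±180°.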

The GLEAM-X pipeline is available on GitHub² in a containerised version that can be run on any platform with Singularity installed (Kurtzer et al., 2017).

Some common software packages are used throughout the data reduction. Unless otherwise specified:

• to convert radio interferometric visibilities into images, we use the widefield imager WSClean (Offringa et al., 2014) version 2.9, which correctly handles the non-trivial w-terms of MWA snapshot images; versions 2 onward include some useful features such as automatically-thresholded CLEANing, and multi-scale CLEAN (Offringa & Smirnov, 2017);
• the primary beam is as defined by Sokolowski et al. (2017); however, for speed, all primary beams are precalculated and then interpolated as required using code which is available on GitHub³ and archived on Zenodo (Morgan & Galvin, 2021);
• to mosaic together resulting images, we use the mosaicking software SWarp (Bertin et al., 2002); to minimise flux density loss from resampling, images are oversampled by a factor of four when being regridded, before being downsampled back to their original resolution;
• to perform source-finding, we use Aegean v2.2.5⁴ (Hancock et al., 2012) and its companion tools such as the Background and Noise Estimator (BANE); this package has been optimised for the wide-field images of the MWA, and includes the "priorised" fitting technique, which is necessary to obtain flux density measurements for sources over a wide bandwidth. Fitting errors calculated by Aegean take into account the correlated image noise, and are derived from the fit covariance matrix, which quantifies the quality of fitting; if the fit is poor, and the residuals are large, the fitting errors on position, shape, flux density, etc. all increase appropriately, so it produces useful error estimates for further use.

¹ https://www.mwatelescope.org/data/observing
² https://github.com/tjgalvin/GLEAM-X-pipeline

We now discuss the typical steps undertaken by the pipeline to produce a set of continuum images and catalogues.

Calibration is performed separately on each observation in a direction-independent manner. The sky model is mainly derived from GLEAM, with additional measurements from the literature for the brighter and more complex sources (e.g. Virgo A in this release). The sky model is described in a companion paper (Hurley-Walker et al. in prep). MitchCal (Offringa et al., 2016) is used to generate a calibration solution for each observation, using the full time range of two minutes. These calibration solutions consist of a complex gain for all four polarisation products (i.e. a Jones matrix) per tile, per (40-kHz) spectral channel. Since the sky model is limited by the resolution of GLEAM, we exclude baselines longer than the maximum baseline of the 128T configuration, i.e., 2.5 km (1667λ at 200 MHz); to avoid contamination from diffuse Galactic emission, we also exclude baselines shorter than 112 m (75λ at 200 MHz). Calibration solutions are inspected for each night, and tiles or receivers are flagged if they show instrumental issues (e.g., phases appear random with respect to frequency). This typically affects between 1 and 8 of 128 available tiles per night. We also examine whether the solutions are stable within an observation: rapidly changing gains indicate that ionospheric conditions will dramatically reduce imaging quality (as in Section 2). Observations in this category are triaged and do not proceed to imaging (Section 3.7). Similarly, the stability of the gains over the night is inspected; in good conditions, the phases of the solutions only change slowly, on the order of 10° on timescales of hours. If more than 20 % of the solutions for a given observation are flagged, we transfer solutions from a well-calibrated observation at the same frequency that is closest in time.
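The baseline cuts quoted above in both metres and wavelengths can be checked with a short conversion. The sketch below is illustrative only; the helper names are hypothetical and it simply applies λ = c/ν.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def baseline_in_wavelengths(length_m, freq_mhz):
    """Express a baseline length in units of the observing wavelength."""
    wavelength_m = SPEED_OF_LIGHT / (freq_mhz * 1e6)
    return length_m / wavelength_m


def in_calibration_range(length_m, min_m=112.0, max_m=2500.0):
    """True if a baseline lies within the range retained for calibration."""
    return min_m <= length_m <= max_m


# Both limits evaluated at 200 MHz; they agree with the quoted
# 75 and 1667 wavelengths to within rounding.
print(baseline_in_wavelengths(112.0, 200.0))
print(baseline_in_wavelengths(2500.0, 200.0))
```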

The very brightest radio sources in the sky, the so-called "A-team" sources (Table 2 in Hurley-Walker et al., 2017), can cause significant image artefacts if they are just outside the field-of-view or in a sidelobe of the primary beam. Additionally, if they are located inside the field-of-view, the standard deconvolution process (Section 3.2) is not always optimal. To remove these sources from the affected observations, we apply a (u, v) subtraction method. The visibilities are phase-rotated to the location of the source, and a 20 × 20 image of the region is formed, using the following WSClean settings:

• imaging the XX and YY instrumental polarisation products;
• each polarisation product is imaged across 64 channels, each 480 kHz wide, that are jointly CLEANed using the -join-channels option, which also produces a 30.72-MHz wide multi-frequency synthesis (MFS) image for each polarisation;
• a fourth-order polynomial via the -fit-spectral-pol argument to constrain the spectral behaviour of each clean component;
• automatic thresholding down to 3σ, where σ is measured as the root mean square (RMS) of the residual XX and YY MFS images at the end of each major cycle;
• a major clean cycle gain of 0.85, i.e. 85% of the flux density of the clean components is removed at the end of each major cycle;
• "Briggs" (Briggs, 1995) robust parameter of −1;
• 10 or fewer major clean cycles, in which the images are inverse Fourier transformed back to visibilities, which are subtracted from the data;
• up to 10⁵ minor cleaning cycles, where the subtraction takes place in the image plane.

During this process the "MODEL" column of the measurement set⁵ is updated with the source components, and after it has completed, is subtracted from the calibrated visibilities. The observation is then phase-rotated back to its original location. In this way, the chromatic effect of the primary beam sidelobe is taken into account when removing the source, without distorting the overall gains of the observation.⁶

We also introduce two extra steps in the calibration stage to make the measurement sets ready for polarisation analysis. One is the parallactic angle correction within the primary beam model (Hales, 2017) , transforming the data from the observed frame (linear feeds on the ground) to an astronomical reference frame according to the IAU standard (polarisation angle measured from North through East). This step is necessary for linear polarisation analysis when observations cover a large range of hour angles. To facilitate later polarisation analysis, we set the cross-terms of the calibration Jones matrices to zero, as well as dividing the Jones matrices for all tiles through by a phasor representing the phases of a reference antenna, which is used for all survey processing. At the same time, we add an X-Y phase determined from observations of a strong polarised source with a known polarisation angle (Lenc et al., 2017) . The X-Y phase correction reduces the leakage between linear and circular polarisation, making circularly polarised data available. A detailed description of polarisation calibration, imaging, and a first data release will be given in a separate publication describing the POlarised GLEAM-X survey (POGS-X; Zhang et al. in prep).

At this stage, the processing diverges depending on whether there is significant Galactic emission. For this paper, we focus on producing catalogues and images which best explore the extragalactic sky (i.e. without attempting to reconstruct such diffuse emission). While the original GLEAM survey used an image weighting with a "Briggs" robust parameter −1, such a weighting is not suitable for the MWA Phase II extended configuration, as the latter has fewer short baselines, reducing the surface brightness sensitivity. For GLEAM-X, a weighting closer to natural is generally preferred to maximise sensitivity (see Hodgson et al., 2020, for a demonstration of the surface brightness sensitivity of MWA Phase II in comparison to other instruments).

To determine an appropriate weighting for extended MWA Phase II imaging, taking into account both angular resolution and surface brightness sensitivity, we trial a range of image weightings, including "Briggs" weighting with robust parameters 0.0, +0.5, and +1.0, as well as uniform and natural weightings. We simulate simple 2-dimensional Gaussian sources with varying full-width at half-maximum (FWHM) in individual template 154-MHz 2-min snapshots after subtracting astronomical sources and noise. Two runs of normal snapshot imaging are performed for each Gaussian source: one with multi-scale CLEAN enabled and the other without. The flux density of the resultant Gaussian sources is then measured using the source-finding software Aegean to model the Gaussian component. The model sources are simulated and measured at 3, 5, 10, 20, and 1000σ, where the RMS noise level σ is estimated from real template images for the given image weightings. Fig. 2 shows the Aegean flux density measurements of the sources for the various image weightings, with and without multi-scale CLEAN. A significant increase in the recovered flux density with multi-scale CLEAN motivates its use. The 'best' case for flux density recovery is a natural weighting with multi-scale CLEAN; however, with natural weighting the improvement in angular resolution compared to GLEAM is only a factor of ∼ 1.5 and the point source sensitivity is not maximised. To balance an increase in resolution while retaining overall sensitivity, a "Briggs" robust parameter of +0.5 is chosen for the full survey. We note that the fraction of flux density loss decreases with increasing source brightness. For instance, comparing the top and leftmost two panels of Fig. 2, 90 % of the flux density is recovered for a 10′-FWHM 20-σ source, whereas all of the flux density would be recovered for a 1000-σ source of the same size.

While these simulations provide an estimate of the flux density recovery for extended Gaussian sources in snapshot observations, the results shown in Fig. 2 should not be used to directly correct flux density measurements made in the final mosaics. WSClean is used to generate images with the following settings:

• a SIN projection centred on the minimum-w pointing, i.e. hour angle = 0, Dec −26.7°;
• four 7.68-MHz channels jointly CLEANed using the -join-channels option, which also produces a 30.72-MHz MFS image;
• the MWA primary beam (Sokolowski et al., 2017) included and applied during cleaning, to produce a Stokes I image;
• automatic thresholding down to 3σ, where σ is the RMS of the residual MFS image at the end of each major cycle;
• automatic CLEANing down to 1σ within pixels identified as containing flux density in previous cycles ("masked" CLEANing);
• a major cycle gain of 0.85, i.e. 85% of the flux density of the clean components is subtracted in each major cycle;
• five or fewer major cycles, in order to prevent the occasional failure to converge during cleaning between 3 and 4σ;
• 10⁶ minor cycles, a limit which is never reached;
• multiscale CLEAN, with the default deconvolution scale settings, and a multiscale-gain parameter of 0.15;
• 8000 × 8000 pixel images, which encompass the field-of-view down to 10% of the primary beam;
• "robust" weighting of 0.5 (see above);
• a frequency-dependent pixel scale such that each image always has 3.5-5 pixels per FWHM of the restoring beam;
• a restoring beam of a 2-D Gaussian fit to the central part of the dirty beam, which is similar in shape (within 10 %) for each frequency band of the entire survey, but varies in size depending on the frequency of the observation.

The extended configuration of the Phase II MWA has low sensitivity to sources with extents > 10′, and thus is not optimal for recovering the complex emission present in the Galactic Plane. However, the original GLEAM survey was recorded in an identical set of drift scan pointings to GLEAM-X, and at that time the array configuration provided many baselines with sensitivity to these larger angular scales. Thus, for the Galactic Plane, we will jointly deconvolve the short baselines of GLEAM with the full GLEAM-X measurement sets, a process enabled by the fast GPU-based image-domain gridding extension to WSClean (van der Tol et al., 2018). This method has been used to great effect to image Fornax A and Centaurus A (McKinley et al., 2022), and can also be used for other extended sources such as the Magellanic Clouds. An example of these results is shown in Fig. 3, and the full description of the process in the context of the Galactic Plane will be given in a further paper (Hurley-Walker et al. in prep).

The ionosphere introduces a λ²-dependent position shift to the observed radio sources, which varies with position on the sky. Following the method of Hurley-Walker et al. (2017) and Hurley-Walker et al. (2019b), we use fits_warp to calculate a model of position shifts based on the difference in positions between the sources in the snapshot and those in a reference catalogue, and then use this model to de-distort the images.

For the reference catalogue, we benefit from using catalogues with similar resolution (∼1′) covering wide areas. For declinations north of −30°, we use NVSS at 1.4 GHz, and for the southern sky SUMSS at 843 MHz. From this combined catalogue we select a subset which is sparse (no internal matches within 3′) and unresolved (integrated to peak flux density ratio of < 1.2).
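The sparse and unresolved selection can be sketched as below. This is an illustrative brute-force version, not the pipeline's implementation: the dictionary keys and function names are hypothetical, the separation threshold is assumed to be in arcminutes, and isolation is assumed to be tested against every other source in the combined catalogue.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, h))))


def select_reference_sources(sources, min_sep_arcmin=3.0, max_ratio=1.2):
    """Keep sources that are isolated and unresolved.

    `sources` is a list of dicts with keys 'ra', 'dec' (degrees),
    'peak' and 'integrated' (same flux density units).
    """
    selected = []
    for i, src in enumerate(sources):
        # Unresolved: integrated-to-peak flux density ratio below the limit.
        if src["integrated"] / src["peak"] >= max_ratio:
            continue
        # Sparse: no other catalogue entry within the minimum separation.
        isolated = all(
            angular_separation_deg(src["ra"], src["dec"],
                                   other["ra"], other["dec"]) * 60.0 >= min_sep_arcmin
            for j, other in enumerate(sources) if j != i
        )
        if isolated:
            selected.append(src)
    return selected
```

The pairwise loop scales as the square of the catalogue size, so a real catalogue of NVSS or SUMSS size would need a spatial index or a dedicated cross-matching tool.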

For each of the 7.68-MHz sub-bands and the wideband 30.72-MHz images formed from each observation, we estimate the background and RMS noise σ using BANE, and perform source-finding using Aegean, with a minimum "seed" threshold of 5σ. Using the iterative catalogue cross-matching functionality of fits_warp, we cross-match the measured sources to the reference catalogue, typically finding 1,000-3,000 cross-matches, from which we retain the 750 brightest sources. A greater number of sources does not improve the accuracy of the warping for typical ionospheric conditions, but does add computational load, so this value was chosen as a point of diminishing returns. These sources typically have flux densities > 1 Jy in the NVSS and SUMSS surveys, so have adequate astrometry to form the baselines for our corrections.

Snapshot images with fewer than 100 successful cross-matches are discarded (typically < 1 % of images). The position shifts in the remaining images are typically of order 25″-5″ over 72-231 MHz, and are coherent on scales of 1-20°, similar to previous studies with the MWA (e.g. Helmboldt & Hurley-Walker, 2020). fits_warp uses these position shifts to create a warp model, apply it to all pixels, and interpolate the results back on to the original pixel grid. This technique yields residual astrometric offsets (with no obvious preferred direction or structure) of order 6″ at the lowest frequencies, and 2″ at the highest frequencies.

While the primary beam model developed by Sokolowski et al. (2017) is significantly more accurate than previous models of the MWA primary beam, there remain some discrepancies between our measured source flux densities and those predicted from existing work. In part, this is due to the flagging of individual dipoles in different tiles across the array, which gives these tiles a different and unmodelled primary beam response. For the observations processed in this work, 72 tiles were fully functional, 39 tiles contained one dead dipole, 14 contained two dead dipoles, and three tiles were flagged for having three or more non-functional dipoles. Including the effect of the flagging by computing and using multiple primary beams at the calibration and imaging stages is computationally expensive, so instead a correction is made after the images are formed.

We cross-match each snapshot with a sparse (no internal matches within 5′), unresolved (major axis a × minor axis b < 2′ × 2′) version of the GLEAM-derived catalogue used for calibration (Section 3.1) and make a global mean flux density scale correction using the flux_warp⁷ package (Duchesne et al., 2020), typically of order 5-15 %. After this global shift, we accumulate the cross-matched tables. Since the discrepancy is consistent in Hour Angle and Dec between snapshots, we can combine the information in this frame of reference.
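A global scale correction of this kind can be illustrated with a few lines of Python. This is a sketch under assumptions, not the flux_warp implementation: the function name is hypothetical, and the mean is assumed here to be taken over log₁₀ of the reference-to-measured ratio (a geometric mean), which is one of several reasonable choices.

```python
import math


def global_scale_factor(measured, reference):
    """Multiplicative factor bringing measured flux densities onto the reference scale.

    Uses the mean of log10(reference / measured) over the cross-matched sources.
    """
    logs = [math.log10(r / m) for m, r in zip(measured, reference)]
    return 10 ** (sum(logs) / len(logs))


# Hypothetical cross-matched flux densities (Jy): the snapshot is ~10% low.
measured = [0.9, 1.8, 4.5]
reference = [1.0, 2.0, 5.0]
factor = global_scale_factor(measured, reference)
corrected = [factor * m for m in measured]
```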

For each frequency, as a function of HA and Dec, we compare the log₁₀ of the ratio R of the integrated flux densities of the measured source values and reference catalogue. Similarly to GLEAM ExGal, we find no trends with HA, and up to ±10 % trends in Dec. Fig. 4 shows this effect for a typical frequency band. A fifth-order polynomial model is fitted as a function of Dec using a weighted least-squares fit, where the weights are the signal-to-noise of the sources as measured in each snapshot. The standard deviation of the data from the model (σ_poly) is measured, and sources with |R| > 3 × σ_poly are removed from the data. A final model of the same form is fitted to the remaining data, forming a correction function which is then applied to every individual snapshot.
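The fit, clip, and refit sequence can be sketched in pure Python as follows. Several points are assumptions rather than statements about the pipeline: the function names are hypothetical, the signal-to-noise weights are taken to multiply the squared residuals, the clipping is applied to the residual of log₁₀ R about the first model, and the scatter is measured as an RMS.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def weighted_polyfit(x, y, w, order):
    """Minimise sum_i w_i (y_i - p(x_i))^2 via the normal equations."""
    n = order + 1
    a = [[sum(wi * xi ** (j + k) for xi, wi in zip(x, w)) for k in range(n)]
         for j in range(n)]
    b = [sum(wi * yi * xi ** j for xi, yi, wi in zip(x, y, w)) for j in range(n)]
    return solve_linear(a, b)  # coefficients, lowest order first


def polyval(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


def fit_with_clipping(dec, log_ratio, snr, order=5, nsigma=3.0):
    """Fit, reject points more than nsigma times the RMS scatter away, refit."""
    coeffs = weighted_polyfit(dec, log_ratio, snr, order)
    resid = [y - polyval(coeffs, d) for d, y in zip(dec, log_ratio)]
    sigma = (sum(r * r for r in resid) / len(resid)) ** 0.5
    kept = [(d, y, s) for d, y, s, r in zip(dec, log_ratio, snr, resid)
            if abs(r) <= nsigma * sigma]
    d2, y2, s2 = (list(col) for col in zip(*kept))
    return weighted_polyfit(d2, y2, s2, order), len(kept)
```

For a real fifth-order fit in Dec, the coordinate would probably need rescaling (for example to the range −1 to 1) to keep the normal equations well conditioned; a library least-squares routine would be the more robust choice in practice.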

After correction, the primary-beam-corrected 30-MHz MFS images have snapshot RMS values of 35-4 mJy beam⁻¹ over 72-231 MHz at their centres, where the primary beam sensitivity is highest.

Figure 4: The upper panel shows the change in log₁₀ R as a function of Dec, where R is the ratio of measured integrated flux density to model integrated flux density. The lower panel shows the same after the polynomial correction function (blue line) has been fit and applied. The adjacent histogram shows the resulting distribution of log₁₀ R over the drift scan. The full-width-at-half-maximum of the resulting histogram is ∼ 2.5 %. Similar results are obtained for other frequency bands.

The goal of continuum mosaicking is to combine the astrometrically- and primary-beam-corrected snapshot images into deeper images with reduced noise, revealing fainter sources and diffuse structures invisible in the individual snapshots. For optimal signal-to-noise when mosaicking the night-long scans together, we use inverse-variance weighting. The weight maps are derived from the square of the primary beam model, scaled by the inverse of the square of the RMS at the centre of the image, as calculated by BANE.
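For a single sky pixel, the inverse-variance weighting described above amounts to the following. This is a minimal sketch with hypothetical function names; it assumes the input pixel values are already primary-beam corrected, so that the local noise in each snapshot is the central RMS divided by the beam response.

```python
def snapshot_weight(beam_response, central_rms):
    """Weight = (primary beam response)^2 / (RMS at image centre)^2."""
    return beam_response ** 2 / central_rms ** 2


def combine_pixel(values, beam_responses, central_rms_values):
    """Inverse-variance weighted mean of one sky pixel across snapshots.

    Returns the mosaicked value and the expected noise, 1 / sqrt(sum of weights).
    """
    weights = [snapshot_weight(b, s)
               for b, s in zip(beam_responses, central_rms_values)]
    total = sum(weights)
    value = sum(w * v for w, v in zip(weights, values)) / total
    return value, total ** -0.5
```

A snapshot in which the pixel sits at half the peak beam response contributes a quarter of the weight of one in which it sits at the beam centre, for equal central RMS.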

As discussed in Section 2, GLEAM-X was observed at three different hour angles. This gives each drift scan slightly different (u, v)-coverage, which results in a slightly different restoring beam and thus point spread function (PSF). While each individual drift scan would have a unique and very nearly Gaussian PSF, it could be expected that a stacking of different unique Gaussians with different position angles would result in a non-Gaussian shape. Since most source-finders expect sources to be well-approximated by Gaussians, we tested this effect in our mosaicking procedure. We selected the scans with the most dissimilar (u, v)-coverage where there would be significant overlap in sources, those at HAs −1 and +1 from the Dec+2 scans, i.e. where the sky is rotating most quickly and projection effects are most important. We simulated a grid of 1 Jy point sources at common RA and Decs for seven observations from each of these scans, and ran them through our imaging and mosaicking stages, using unity image weighting and neglecting unnecessary astrometric and primary beam corrections. We used Aegean to source-find on the resulting mosaic, making corrections as necessary for the projection (Section 4). We recovered the sources at integrated flux densities of 0.995-0.999 Jy and peak flux densities of 0.96-0.97 Jy beam⁻¹. Subtracting these Gaussian fits from the image plane data, we found residuals at the < 4.5 % level, indicating that level of deviation away from Gaussianity. Since the integrated flux densities were recovered well, and the non-Gaussianity is fairly small, even for this worst-case scenario, we adopt this mosaicking method going forward.

For each 7.68-MHz frequency channel, we form a night-long drift scan, and examine it to check for any remaining data quality issues. We also form five 30.72-MHz bandwidth mosaics from the multi-frequency synthesis images generated during cleaning (Section 3.2). After quality checking, for each frequency, data from all four nights that cover the same RA range are combined together to make a single deep mosaic. At this stage, we also form a 60-MHz bandwidth "wideband" image over 170-231 MHz, as this gives a good compromise between sensitivity and resolution, and will be used for source-finding (Section 4).

As described in Appendix A of Duchesne et al. (2020), imaging away from the phase centre incurs a significant phase rotation during re-gridding as part of the mosaicking process. This re-projection results in a point-spread function that is not defined at the image reference coordinates. This is corrected partially by introduction of a projected regrid factor, $f$, that is applied to the PSF major axis to form an 'effective' PSF major axis. For a resultant ZEA projection this is simply related to the change in solid angle over the original SIN-projected image with (e.g. Thompson et al., 2001)

$$f = \frac{1}{\sqrt{1 - l^2 - m^2}}, \tag{1}$$

where $l$ and $m$ are the direction cosines defined with reference to the original, SIN-projected image direction. The ZEA projection itself reduces additional area-related projection effects due to its equal area nature. This is used in initial source-finding on the mosaics as the integrated flux density is correct and the product of the major and minor PSF axes is also correct for the new projection.
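As an illustration of the solid-angle argument, the following minimal sketch evaluates the ratio of true solid angle to SIN-projected area, $1/\sqrt{1-l^2-m^2}$, at a given offset from the projection centre. The function names and the example coordinates are assumptions made for this sketch and are not taken from the GLEAM-X pipeline.

```python
import math

def direction_cosines(ra, dec, ra0, dec0):
    """Direction cosines (l, m) of (ra, dec) about the centre (ra0, dec0); radians."""
    dra = ra - ra0
    l = math.cos(dec) * math.sin(dra)
    m = (math.sin(dec) * math.cos(dec0)
         - math.cos(dec) * math.sin(dec0) * math.cos(dra))
    return l, m

def sin_solid_angle_ratio(l, m):
    """Ratio dOmega / (dl dm) for a SIN (orthographic) projection."""
    r2 = l * l + m * m
    if r2 >= 1.0:
        raise ValueError("(l, m) lies outside the visible hemisphere")
    return 1.0 / math.sqrt(1.0 - r2)

# Hypothetical example: a source 15 degrees north of a centre at Dec -27 deg
l, m = direction_cosines(0.0, math.radians(-12.0), 0.0, math.radians(-27.0))
f = sin_solid_angle_ratio(l, m)
print(f"l = {l:.4f}, m = {m:.4f}, ratio = {f:.4f}")
```

At the projection centre the ratio is unity, and it grows slowly with angular distance, so the correction only matters toward the edges of wide-field images.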

Residual uncorrected ionospheric distortions can cause slight blurring of the final mosaicked PSF. This can be characterised by examining sources which are known to be unresolved, which is best determined by using a higher-resolution catalogue than our calibration sky model; we thus use the NVSS and SUMSS combined catalogue described in Section 3.3. Following Hurley-Walker et al. (2017, 2019b), we cross-match this catalogue with the sources detected in our mosaics at signal-to-noise > 10, and then measure the size and shape of these sources in the GLEAM-X mosaics. We create a PSF map by averaging and interpolating over these sources, using HEALPix (order = 4, i.e. pixels ∼3° on each side) as a natural frame in which to accumulate and average source measurements.

After the PSF map has been measured, its antecedent mosaic is multiplied by a (position-dependent) "blur" factor of

$$B = \frac{a_\mathrm{PSF}\, b_\mathrm{PSF}}{a_\mathrm{rst}\, b_\mathrm{rst}}, \tag{2}$$

where $a_\mathrm{rst}$ and $b_\mathrm{rst}$ are the FWHM of the major and minor axes of the restoring beam, and $a_\mathrm{PSF}$ and $b_\mathrm{PSF}$ are the FWHM of the major and minor axes of the PSF. This has the effect of normalising the flux density scale such that both peak and integrated flux densities agree, as long as the correct, position-dependent PSF is used. Values of $B$ are typically 1.0-1.2.
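A short numerical sketch of this normalisation follows. It assumes that the blur factor is the ratio of the measured PSF area to the restoring beam area, which is consistent with the quoted range of 1.0-1.2; the beam sizes and the peak flux density used are invented for illustration.

```python
def blur_factor(a_rst, b_rst, a_psf, b_psf):
    """Ratio of PSF area to restoring-beam area (FWHM axes in the same units)."""
    return (a_psf * b_psf) / (a_rst * b_rst)

# Hypothetical values: a 60" x 50" restoring beam blurred to 63" x 54"
B = blur_factor(60.0, 50.0, 63.0, 54.0)

# Multiplying the mosaic by B raises a blurred peak back toward its true value
blurred_peak = 0.90               # Jy/beam, illustrative
corrected_peak = blurred_peak * B
print(f"B = {B:.3f}, corrected peak = {corrected_peak:.3f} Jy/beam")
```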

The mosaicking stage of Section 3.5 results in 26 mosaics: one with 60-MHz bandwidth, five with 30-MHz bandwidth and the other 20 covering 72-231 MHz in 7.68-MHz narrow bands. In this work, we run the pipeline on four nights of observing indicated in Table A1, producing a large set of mosaics with decreasing sensitivity toward the edges. Here we downselect to a region which is representative of the survey's eventual sensitivity, covering 4 h ≤ RA ≤ 13 h, −32.7° ≤ Dec ≤ −20.7°, for further analysis. Figs. 5-7 show this area for four of the deeper mosaics. Postage stamps of these images are available on both SkyView and the GLEAM-X website. The header of every postage stamp contains the PSF information calculated in Section 3.6, and the completeness information calculated in Section 4.3. We use BANE to determine the background and RMS noise of each mosaic. During development of this survey, we noticed that BANE's default of three loops of 3-sigma-clipping is insufficient to exclude source-filled pixels to accurately determine the background and RMS noise. The issue may not have been noticed in previous works due to the relatively higher sensitivity and resulting source density of GLEAM-X (although Hurley-Walker et al. (2017) noted a similar effect from the high sidelobe confusion levels of GLEAM). We modified BANE to use 10 loops and found that it produced more accurate noise and background estimates (see Section 4.2.2 for further analysis). Fig. 8 shows an example of 10 sq. deg of the 170-231 MHz wideband mosaic and associated background and RMS noise, as well as the same region as seen by GLEAM ExGal, in which the resolution is lower, the noise is higher, and the diffuse Galactic synchrotron on scales of > 1° is visible.
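The role of the loop count in sigma-clipping can be sketched with a generic implementation. This is not BANE's code: the function `clipped_stats`, the use of the median as the centre, and the toy pixel values are all assumptions for illustration, and the toy data are not intended to reproduce the behaviour seen in the real mosaics.

```python
import random
import statistics

def clipped_stats(values, sigma=3.0, loops=10):
    """Iteratively reject values more than `sigma` std devs from the median."""
    data = list(values)
    for _ in range(loops):
        med = statistics.median(data)
        std = statistics.pstdev(data)
        kept = [v for v in data if abs(v - med) <= sigma * std]
        if len(kept) == len(data):
            break  # converged: nothing further to reject
        data = kept
    return statistics.median(data), statistics.pstdev(data)

random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(5000)]        # unit-RMS noise
sources = [random.uniform(5.0, 50.0) for _ in range(250)]    # source-filled pixels
pixels = noise + sources

bkg3, rms3 = clipped_stats(pixels, loops=3)
bkg10, rms10 = clipped_stats(pixels, loops=10)
print(f"3 loops:  background = {bkg3:.3f}, RMS = {rms3:.3f}")
print(f"10 loops: background = {bkg10:.3f}, RMS = {rms10:.3f}")
```

Each pass can only remove pixels that exceed the current threshold, so a field with many source-filled pixels may need more passes before the estimate settles on the underlying noise.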

Combining data in the image plane may lead to the recovery of faint sources that were not cleaned during imaging. The RMS noise levels in the wide-band (30-MHz) mosaics range from 5-1.3 mJy beam$^{-1}$ over 72-231 MHz. This compares to typical snapshot RMS values of 35-4 mJy beam$^{-1}$ over the same frequency range (Section 3.4). Cleaning is performed down to 1-σ for components detected at 3-σ in a snapshot (Section 3.2). The centres of each image form the greatest contribution to each mosaic due to weighting by the square of the primary beam (Section 3.5). We can therefore estimate at what signal-to-noise threshold uncleaned sources will typically appear: from $\frac{3\times35}{5} = 21$ to $\frac{3\times4}{1.3} \approx 9$ over 72-231 MHz, and at $\sim\frac{3\times4}{1} = 12$ in the wideband (60-MHz) source-finding image (Section 4).
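The arithmetic behind these thresholds can be written out explicitly. The helper below is only a restatement of the ratios quoted in the text (snapshot detection threshold times snapshot RMS, divided by mosaic RMS); its name is an assumption of this sketch.

```python
def uncleaned_snr(detect_sigma, snapshot_rms, mosaic_rms):
    """Mosaic signal-to-noise of a source sitting at the snapshot clean threshold."""
    return detect_sigma * snapshot_rms / mosaic_rms

low_band = uncleaned_snr(3, 35.0, 5.0)    # lowest frequencies (RMS in mJy/beam)
high_band = uncleaned_snr(3, 4.0, 1.3)    # highest frequencies
wideband = uncleaned_snr(3, 4.0, 1.0)     # 60-MHz source-finding image

print(f"{low_band:.1f}, {high_band:.1f}, {wideband:.1f}")
```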

Modelling this effect, especially in conjunction with Eddington bias (see e.g. Section 4.2.1), which is also significant at these faint flux densities, lies beyond the scope of this paper. It would involve significant work and is mainly of interest for performing careful measurement of low-frequency source counts (see Franzen et al., 2019, for an equivalent analysis for GLEAM). At this stage we merely suggest additional caution when using flux densities for sources at low (< 12) signal-to-noise. The mosaics at this stage are only a subset of the GLEAM-X sky. The RMS increases toward the edges due to the drop in primary beam sensitivity and selected RA range of these observations. Future mosaics comprised of further nights of observing will be combined to produce near-uniform sensitivity across the sky.

A source catalogue derived from the images is a useful data product that enables straightforward cross-matching, spectral fitting, and population studies. We aim here to accurately capture components of sizes < 10 across all frequency bands, fitting elliptical two-dimensional Gaussians with Aegean. We carry out this process on the 1,447-deg$^2$ region selected in Section 3.5, and the steps are generally applicable to future mosaics produced from the survey.

We follow the same strategy as Hurley-Walker et al. (2017) : using the 170-231 MHz image, a deep wideband catalogue centred at 200 MHz is formed. We set the "seed" clip to four, i.e. pixels with flux density > 4σ are used as initial positions on which to fit components, where σ denotes the local RMS noise. After the sources are detected, we filter to retain only sources with integrated flux densities ≥ 5σ. We then use the "priorised" fitting technique to measure the flux densities of each source in the narrow-band images: the positions are fixed to those of the wide-band source-finding image, the shapes are predicted by convolving the shape in the source-finding image with the local PSF, and the flux density is allowed to vary. Where the sources are too faint to be fit, a forced measurement is carried out. We perform several checks on the quality of the catalogue, detailed below.

In this Section we examine the errors reported in the catalogue. First, we examine the systematic flux density errors; then, we examine the noise properties of the wide-band source-finding image, as this must be close to Gaussian in order for sources to be accurately characterised, and for estimates of the reliability to be made, which we do in Section 4.3. Finally, we make an assessment of the catalogue's astrometric accuracy. These statistics are given in Table 1 .

GLEAM forms the basis of the flux density calibration in this work, and in this Section we examine any differences between the flux densities measured here compared to those measured by GLEAM ExGal. We select compact sources from both catalogues (integrated / peak flux density < 2) that cross-match within a 15 radius, and have a good power-law spectral index fit (reduced $\chi^2 \leq 1.93$; see Section 4.4). Curved- and peaked-spectrum sources comprise only a small proportion of the catalogue and are more likely to be variable (Ross et al., 2021), so are not included in this check. We excluded all sources in GLEAM-X data which have a cross-match within 2 in order to avoid selecting sources which are unresolved in GLEAM and resolve into multiple components in GLEAM-X.

As surveys approach their detection limit, measured source flux densities are increasingly likely to be biased high due to noise; there are a larger number of faint sources available to be biased brighter by noise than there are bright sources available to be biased dimmer. Eddington (1913) describes corrections that can be made to an ensemble of measurements to remove this bias. For the purpose of this section, we wish to correct the individual GLEAM flux density measurements in order to check the GLEAM-X flux density scale. We use Eq. 4 of Hogg & Turner (1998) to predict the maximum likelihood true flux density of each of the GLEAM 200-MHz measurements:

where σ is the local RMS noise, and $q$ is the logarithmic source count slope (i.e. the index in $\mathrm{d}N/\mathrm{d}S \propto S^{q}$); at these flux density levels $q = 1.54$ (Franzen et al., 2016). Fig. 9 plots the ratio of the two catalogue integrated flux densities as a function of signal-to-noise in GLEAM-X, with a correction applied to the GLEAM flux densities. The ratio trends toward 1.05 at higher flux densities, although the very brightest sources show only small discrepancies from unity. Since the effect is small, we do not attempt to correct for it here, but may revisit our data processing in future to see if it can be reduced, corrected, or eliminated. Since the flux density scale is tied to GLEAM, which has an 8 % error relative to other surveys, this value may be used as an error when combining the data with other work.
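A correction of this kind can be sketched under explicit assumptions. Taking the differential counts to fall as $S^{-q}$ (with $q = 1.54$ treated as a positive index, a sign convention assumed here) and Gaussian noise of width σ, maximising the posterior $S^{-q}\exp[-(S - S_o)^2/2\sigma^2]$ gives $S^2 - S_o S + q\sigma^2 = 0$, whose larger root is used below. Whether this matches the exact form and slope convention of Eq. 4 of Hogg & Turner (1998) is not verified here.

```python
import math

def ml_flux(s_obs, sigma, q=1.54):
    """Posterior-maximising flux for counts dN/dS ~ S**(-q) and Gaussian noise.

    Returns None when no maximum exists (measurement too faint to correct).
    """
    snr2 = (s_obs / sigma) ** 2
    disc = 1.0 - 4.0 * q / snr2
    if disc < 0.0:
        return None
    return 0.5 * s_obs * (1.0 + math.sqrt(disc))

# Illustrative values: observed flux densities in units of the local RMS
for snr in (5.0, 10.0, 50.0):
    corrected = ml_flux(snr, 1.0)
    print(f"S/N = {snr:4.0f}: corrected / observed = {corrected / snr:.4f}")
```

The correction shrinks rapidly with signal-to-noise, which is why it matters mainly for the faintest cross-matched sources.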

No obvious trends are visible in the fitted spectral indices (Fig. 10) ; we note that the error bars on the GLEAM-X measurements are uniformly smaller due to the increased signal-to-noise of the data.

We briefly examine the noise properties of the source-finding 170-231-MHz image. We use an 18 deg$^2$ region centred on RA 10 h 30 m, Dec −27°30′ with fairly typical source distribution. Following Hurley-Walker et al. (2017), we measure the background of the region using BANE, and subtract it from the image. We then use AeRes ("Aegean REsiduals") from the Aegean package to mask out all sources which were detected by Aegean, down to 0.2× the local RMS. We also use AeRes to subtract the sources to show the magnitude of the residuals. Histograms of the remaining pixels are shown, for the unmasked and masked images, in Fig. 11.

The higher resolution of the GLEAM-X survey compared to GLEAM means that confusion forms a smaller fraction of the noise contribution, and thus the noise distribution is almost completely symmetric. Surveys close to the confusion limit will see a skew toward a more positive distribution, as seen by Hurley-Walker et al. (2017) . Noise and background maps are made available as part of the survey data release.

Following Hurley-Walker et al. (2017), we measure the astrometry using the 200-MHz catalogue, as this provides the locations and morphologies of all sources. To determine the astrometry, high signal-to-noise (integrated flux density > 50σ) GLEAM-X sources are cross-matched with the isolated sparse NVSS and SUMSS catalogue (Section 3.3); the positions of sources in these catalogues are assumed to be correct and RA and Dec offsets are measured with respect to those positions. The average RA offset is +14 ± 700 mas, and the average Dec offset is +21 ± 687 mas (errors are 1 standard deviation).

In 99 % of cases, fitting errors on the positions are larger than the measured average astrometric offsets. Given the scatter in the measurements, we do not attempt to make a correction for these offsets. As each snapshot has been corrected, residual errors should not vary on scales smaller than the size of the primary beam. Fig. 12 shows the density distribution of the astrometric offsets, and histograms of the RA and Dec offsets, which were used to calculate the values listed in this section.
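The offset statistics quoted above can be computed along the following lines. This is a minimal sketch: the matched positions are invented, and scaling the RA difference by cos(Dec) to obtain an on-sky offset is an assumption about the convention, not something stated in the text.

```python
import math
import statistics

MAS_PER_DEG = 3.6e6  # milliarcseconds per degree

def offset_mas(ra, dec, ra_ref, dec_ref):
    """On-sky RA and Dec offsets in mas for positions given in degrees."""
    d_ra = (ra - ra_ref) * math.cos(math.radians(dec_ref)) * MAS_PER_DEG
    d_dec = (dec - dec_ref) * MAS_PER_DEG
    return d_ra, d_dec

# Hypothetical matched pairs: (ra, dec, ra_ref, dec_ref) in degrees
pairs = [
    (150.0001, -27.0000, 150.0, -27.0),
    (151.9999, -26.0001, 152.0, -26.0),
    (153.0002, -25.0002, 153.0, -25.0),
]
offsets = [offset_mas(*p) for p in pairs]
ra_off = [o[0] for o in offsets]
dec_off = [o[1] for o in offsets]

print(f"RA:  {statistics.mean(ra_off):+.0f} +/- {statistics.stdev(ra_off):.0f} mas")
print(f"Dec: {statistics.mean(dec_off):+.0f} +/- {statistics.stdev(dec_off):.0f} mas")
```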

Following the same procedure as Hurley-Walker et al. (2017), simulations are used to quantify the completeness of the source catalogue at 200 MHz, using the wideband mosaics. We use 26 realisations, in each of which 25,000 simulated point sources of the same flux density are injected into the 170-231 MHz mosaics (at approximately 20 % of the true source density). The flux density of the simulated sources is different for each realisation, spanning the range $10^{-3}$ to $10^{-0.5}$ Jy in increments of 0.1 dex. The positions of the simulated sources are chosen randomly but not altered between realisations; to avoid introducing an artificial factor of confusion in the simulations, simulated sources are not permitted to lie within 5 of each other. Sources are injected into the mosaics using AeRes. The major and minor axes of the simulated sources are set to $a_\mathrm{PSF}$ and $b_\mathrm{PSF}$, respectively.

For each realisation, the source-finding procedures described in Section 4 are applied to the mosaics and the fraction of simulated sources recovered is calculated. In cases where a simulated source is found to lie too close to a real (> 5σ) source to be detected separately, the simulated source is considered to be detected if the recovered source position is closer to the simulated rather than the real source position. This type of completeness simulation therefore accounts for sources that are omitted from the source-finding process through being too close to a brighter source. Fig. 13 shows the fraction of simulated sources recovered as a function of $S_\mathrm{200MHz}$. The completeness is estimated to be 50 % at ∼ 5.6 mJy rising to 90 % at ∼ 10 mJy; these flux densities were typically below the RMS noise in GLEAM ExGal. Errors on the completeness estimate are derived assuming Poisson errors on the number of simulated sources detected. Fig. 14 shows the spatial distribution of the completeness for the work presented here; the slight dependence on RA is largely due to the presence of bright sources in large mosaics, e.g. Hydra A at ∼ RA 09 h 20 m, Dec −12°. The roll-off in Declination is due to the primary beam sensitivity of the single drift scan used in this work; in the full survey, multiple drift scans will be used to ensure near-uniform sensitivity and completeness across the sky.

The completeness at any pixel position is given by $C = N_d/N_s$, where $N_s$ is the number of simulated sources in a circle of radius 6° centred on the pixel and $N_d$ is the number of simulated sources that were detected above 5σ within this same region of sky. The completeness maps, in FITS format, can be obtained from the supplementary material. Postage stamp images from the GLEAM-X VO server also include the estimated completeness at representative flux densities in their headers.
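The bookkeeping of the completeness simulation can be sketched as follows. The flux density grid follows the range and step stated above; the detection counts are hypothetical, and propagating a Poisson error on the number of detections as $\sqrt{N_d}/N_s$ is one simple reading of the error treatment rather than a confirmed detail.

```python
import math

# Flux densities of the 26 realisations: 10^-3 to 10^-0.5 Jy in 0.1-dex steps
flux_grid_jy = [10 ** (-3.0 + 0.1 * i) for i in range(26)]

def completeness(n_detected, n_simulated):
    """Return C = N_d / N_s and a Poisson error on N_d propagated to C."""
    c = n_detected / n_simulated
    err = math.sqrt(n_detected) / n_simulated
    return c, err

# Hypothetical counts of simulated sources within one circle on the sky
c, err = completeness(900, 1000)
print(f"{len(flux_grid_jy)} realisations, "
      f"{flux_grid_jy[0] * 1e3:.1f} to {flux_grid_jy[-1] * 1e3:.0f} mJy")
print(f"C = {c:.2f} +/- {err:.2f}")
```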

To test the reliability of the source finder and check how many of the detected sources might be false detections, we use the same source-finding procedure as described above but search only for negative peaks. Aegean is run with a seedclip of 4σ (allowing for detections with peaks above this limit) and detections outside of the central region are cut. This initially yields 1,144 negative detections. Filtering the results to retain only sources with integrated flux densities $S_\mathrm{int} > 5\sigma$ leaves 198 detections. Inspection revealed that some of these detections were artefacts around very bright sources, rather than noise peaks (see Fig. 15). There were also similar positive detections of artefacts around these bright sources. We filtered out any detections (positive or negative) that were
• within 5 of a positive detection whose peak flux density was ≥ 2 Jy and where the absolute value of the ratio of the fainter peak to the bright peak was ≥ 350; or
• within 12 of a positive detection whose peak flux density was ≥ 6 Jy and where the absolute value of the ratio of the fainter peak to the bright peak was ≥ 650.

This accounts for the moderately bright artefacts closer in to the bright sources and fainter artefacts that can exist further out from very bright sources. This filtering cuts 157 positive detections and 149 negative detections. We also note that there is a tendency for negative sources to appear close to positive sources regardless of their brightness, potentially due to faint uncleaned sidelobes slightly reducing the map brightness very close to sources. These negative sources will not have positive counterparts, so potentially can also be filtered before estimating the reliability. The criterion in this case is that they cross-match with a positive source within 2 . An example is shown in Fig. 16 . These comprise a further 46 sources which may optionally be removed.

Comparing the filtered samples of negative to positive detections, we can estimate the number of positive detections that are false detections as a function of signal-to-noise. For a conservative estimate, where we do not apply the second filter, we find that at a signal-to-noise ratio of five, the fraction of false detections is just under 2 %, falling quickly to 1 % for $S_\mathrm{int} > 5.5\sigma$. If we also filter negative sources that lie close to positive sources, we find that the reliability is much higher, with only 0.75 % of sources false at 5σ, falling to none at 8σ. For each significance bin, we convert these fractions to a reliability estimate and plot them as a function of signal-to-noise in Fig. 17. We note that were the noise completely Gaussian, we would expect just one +5σ source in this sky area to appear purely by chance, and none with flux density > 5.5σ; i.e., a reliability of 99.999 % in the faintest bin, rising quickly to 100 %.
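The conversion from negative detections to a reliability estimate can be illustrated with a short sketch. The per-bin counts below are hypothetical and chosen only to resemble the percentages quoted in the text; the assumption is that the number of negative detections in a signal-to-noise bin estimates the number of false positive detections in that bin.

```python
def reliability(n_negative, n_positive):
    """Fraction of positive detections expected to be real in one S/N bin."""
    return 1.0 - n_negative / n_positive

# Hypothetical counts per signal-to-noise bin: (label, negatives, positives)
bins = [
    ("5.0-5.5", 20, 1000),
    ("5.5-6.0", 9, 900),
    ("8.0-8.5", 0, 500),
]
results = {label: reliability(neg, pos) for label, neg, pos in bins}
for label, value in results.items():
    print(f"S/N {label}: reliability = {100.0 * value:.1f} %")
```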

We fit two models to the twenty narrow-band flux density measurements for all detected sources (using $S \propto \nu^{\alpha}$). The first model is a simple power-law parameterised as

$$S_\nu = S_{\nu_0} \left(\frac{\nu}{\nu_0}\right)^{\alpha}, \tag{4}$$

where $S_{\nu_0}$ is the brightness of the source, in Jy, at the reference frequency $\nu_0$, and $\alpha$ describes the gradient of the spectral slope in logarithmic space. We also extend this power-law model to

$$S_\nu = S_{\nu_0} \left(\frac{\nu}{\nu_0}\right)^{\alpha} \exp\left[q \left(\ln\frac{\nu}{\nu_0}\right)^{2}\right], \tag{5}$$

which includes the additional free parameter $q$ to capture any higher order spectral curvature features, where increasing $|q|$ captures stronger deviations from a simple power law; if $q$ is positive, the curve is opening upward (convex) and if $q$ is negative, the curve is opening downward (concave). This model is not physically motivated, and may not appropriately describe sources with different power-law slopes in the optically thin and thick regimes, but provides a useful filter to identify interesting sources. For both models we set $\nu_0$ to 200 MHz.

Figure 15 (caption, continued). The white circle shows a negative source that was not filtered, while the white × shows a negative source that was filtered for being too close to a bright source.

Figure 16. An example of a negative source found next to a positive source that could optionally be filtered when generating the reliability estimate. Black circles indicate detected positive components that are not filtered; the white + shows a negative source that can optionally be filtered.

Figure 17. Estimates of the reliability of the catalogue as a function of signal to noise. The lower blue curve shows a conservative estimate without filtering negative sources detected on the edges of positive sources. The upper red curve shows a more generous estimate derived after filtering these sources out. In comparison, GLEAM ExGal has a reliability of 98.9 %-99.8 % at these signal-to-noise levels.

To perform accurate spectral fitting, the errors on the flux density measurements must be known. Following Hurley-Walker et al. (2017), spectral fitting allows us to check the flux density consistency of the catalogue. A flux density scaling error of 2 % yields a median reduced $\chi^2$ of unity across the catalogue, whereas higher or lower values bias the reduced $\chi^2$ lower or higher as a function of signal-to-noise. We thus adopt 2 % as the measure of our internal flux density scale, and set the errors on the flux density to this value added in quadrature with the local fitting error from Aegean. (Note that 8 % is more appropriate when comparing with other catalogues as this is the flux density scale accuracy of GLEAM, to which GLEAM-X is tied (see Section 4.2.1).)
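The error budget described here amounts to a quadrature sum, sketched below. The function name and example values are assumptions for illustration; the 2 % and 8 % fractions are those quoted in the text.

```python
import math

def total_flux_error(flux, fit_error, scale_fraction=0.02):
    """Combine a fractional flux-scale error with a fitting error in quadrature."""
    return math.hypot(scale_fraction * flux, fit_error)

# Illustrative source: 100 mJy with a 1.5 mJy fitting error
internal = total_flux_error(100.0, 1.5)                       # 2 % internal scale
external = total_flux_error(100.0, 1.5, scale_fraction=0.08)  # 8 % vs other surveys
print(f"internal: {internal:.2f} mJy, external: {external:.2f} mJy")
```

For bright sources the scale term dominates, while for faint sources the fitting error sets the uncertainty.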

We applied the Levenberg-Marquardt non-linear least-squares regression algorithm (as implemented in the scipy Python module; Virtanen et al., 2020) to Equations 4 and 5 for each detected source. We did not include narrow-bands with negative integrated flux density measurements. We discarded the fitting results if
• there were fewer than 15 integrated flux density measurements for a source;
• a $\chi^2$ goodness-of-fit test indicated at a > 99 % likelihood of an incorrectly-fit model; or
• $q/\Delta q < 3$, to ensure constrained deviations from a power-law are statistically significant.
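The power-law fit and the first selection criterion can be illustrated with a simplified stand-in. The sketch below does not use Levenberg-Marquardt: it swaps in a closed-form weighted straight-line fit of ln S against ln(ν/ν₀), which approximates the non-linear fit at high signal-to-noise and needs only the standard library. The band centres and flux densities are synthetic.

```python
import math

def fit_power_law(freqs_mhz, fluxes, errors, nu0=200.0, min_points=15):
    """Weighted linear fit in log space; returns (S_nu0, alpha) or None."""
    # Drop non-positive measurements, as negative flux densities are excluded
    good = [(f, s, e) for f, s, e in zip(freqs_mhz, fluxes, errors) if s > 0]
    if len(good) < min_points:
        return None  # too few measurements to keep the fit
    x = [math.log(f / nu0) for f, _, _ in good]
    y = [math.log(s) for _, s, _ in good]
    w = [(s / e) ** 2 for _, s, e in good]  # inverse variance of ln S
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    alpha = (sw * sxy - sx * sy) / det
    ln_s0 = (sxx * sy - sx * sxy) / det
    return math.exp(ln_s0), alpha

# Synthetic spectrum: 20 evenly spaced bands (illustrative), alpha = -0.8
freqs = [76.0 + 7.68 * i for i in range(20)]
truth = [0.5 * (f / 200.0) ** -0.8 for f in freqs]
errs = [0.02 * s for s in truth]
s0, alpha = fit_power_law(freqs, truth, errs)
print(f"S_200 = {s0:.3f} Jy, alpha = {alpha:.3f}")
```

The goodness-of-fit and curvature-significance criteria are not implemented in this sketch.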

For this initial data release we included only the model with the lower reduced-$\chi^2$ statistic in our catalogue. Applying these criteria, a total of 70,432 and 888 source components have fitting results recorded for power-law and curved power-law models, respectively. Fig. 19 shows five example SEDs, four with either power-law or curved power-law models constrained using exclusively GLEAM-X, and one with GLEAM-X data supplemented with data from SUMSS and NVSS to fit a two-component power-law model described as

$$S_\nu = \frac{S_p}{1 - e^{-1}} \left(1 - e^{-(\nu/\nu_p)^{\alpha_\mathrm{thin} - \alpha_\mathrm{thick}}}\right) \left(\frac{\nu}{\nu_p}\right)^{\alpha_\mathrm{thick}}, \tag{6}$$

where $S_p$ is the brightness (Jy) at the peak frequency $\nu_p$ (MHz), and $\alpha_\mathrm{thin}$ and $\alpha_\mathrm{thick}$ are the spectral slopes in the optically thin and optically thick regimes, respectively (Callingham et al., 2017). For sources fit well by power-law SEDs, the distributions of spectral indices $\alpha$ with respect to flux density are plotted in Fig. 18. The median $\alpha$ for the brightest bin is −0.83, in excellent agreement with previous results (e.g. Mauch et al., 2003; Lane et al., 2014; Heald et al., 2015).

The priorised fitting routine in Aegean separates the island finding stage from the component characterisation stage, and is analogous to aperture photometry in optical images. We use this in GLEAM-X to ensure that each radio component identified in our deep 170-231 MHz source-finding image has an equivalent component characterisation in each of the other 25 GLEAM-X images. This process, however, does not enforce spectral smoothness between images adjacent in frequency. For GLEAM-X, this process becomes less reliable towards lower frequencies, where the PSF becomes large enough that nearby components are blended to the point where their brightness profiles cannot be distinguished. Although model optimisation methods may be able to constrain the total brightness across all components, the division of brightness between individual components becomes degenerate. We highlight an example of this behaviour in Figure 20. This problem is most apparent for sources that are slightly resolved and characterised as two separate components within 120 from one another. Further development of Aegean to perform component characterisation across all images jointly while including physically-motivated parametrisation of the spectra is planned to address this issue.

The resulting catalogue consists of 78,967 radio sources detected over 1,447 deg 2 . 71,320 sources are fit well by power-law or curved-spectrum SEDs. The catalogue has 722 columns (see Appendix B) and is available via Vizier. The catalogue measurements can be used to perform more complex spectral fits, especially in conjunction with other radio measurements. Table 1 shows the properties of the images and catalogue in this data release, as well as some forward predictions for the full survey, in comparison to GLEAM.

The total data volume of GLEAM-X visibilities is large (∼ 2 PB) and file transfer operations comprise a significant proportion (∼ 40 %) of our processing time. When processing the data, each observation takes up ∼ 100 GB of disk space in visibilities, images, and metadata. Given the richness of the GLEAM-X survey, we are strongly motivated to perform additional operations on the data while they reside on disk in order to avoid moving the data more frequently. In this section we discuss the current extensions to the pipeline that we expect will yield a range of science outcomes not possible with mosaicked images.

The wide field-of-view of the MWA combined with the repeated drift scanning strategy of GLEAM-X yields a dataset that is interesting to search for transient radio [...] single transient candidate, but understanding its nature was difficult with the (limited) data available. Historically this has been a common occurrence for low-frequency radio transients, with many unusual phenomena detected but never fully understood (e.g. Hyman et al., 2005; Stewart et al., 2016; Varghese et al., 2019).

The GLEAM-X drift scans were observed such that the LST was matched for repeated observations at the same pointing and frequency. This enabled a search using "visibility differencing", wherein calibrated measurement sets were differenced, and the resulting nearly-empty visibilities were inverted to form a dirty image, which could be used to search for transient sources (Honours thesis: O'Doherty 2022; Hancock et al. in prep.) . One high-significance candidate was followed up using the large MWA archive, resulting in the discovery of a new type of highly polarised radio transient, repeating on the unusual timescale of 18.18 minutes (Hurley-Walker et al., 2022) . The wide bandwidth of GLEAM-X was key to finding the dispersion measure of the source, and therefore estimating its distance.

The visibility differencing approach resulted in a large number of false positives due to the differences in ionospheric conditions between observations. The discovery of a new type of radio transient, and the utility of our polarisation and wideband measurements, motivates the inclusion of a transient imaging step in our routine pipeline processing.

Our approach is to image every 4-s interval of each observation, at the same time subtracting the deep model that was formed during imaging (Section 3.7), the same approach that is currently used for imaging MWA interplanetary scintillation observations (Morgan et al. in prep.). This results in a thermal-noise-dominated Stokes-I image cube where only differences between each time step and the continuum average are recorded. This cube is then stored in an HDF5 file as described in Appendix 2 of Morgan et al. (2018). Briefly, the image cube is reordered so that time is the fastest axis, and the pixel data is demoted to half precision (16-bit) floats. This results in a typical data volume of 600 MB per observation. Once in this format, any number of algorithms can be conveniently applied to detect and measure time-domain signals.

While imaging every 0.5-s sample would be ideal, it would multiply the storage and processing requirements for all other steps of the pipeline by a factor of eight. If a signal of interest is discovered, it is simple (and indeed necessary) to reprocess the data with higher time (and, if needed, frequency) resolution. Future data releases will provide these data and quantitative analyses thereof.

The source position offsets determined during the dewarping process (Section 3.3) yield information about the slant total electron content (dTEC) averaged over the telescope array projected on to the sky in that field-of-view. If dTEC varies significantly over the array, the wavefronts from different parts of the sky will arrive at different times, and radio sources will appear stretched, duplicated, or will disappear completely. Conversely, if images are created using sub-arrays of the telescope, the apparent difference in source positions can be used to constrain an approximate height of the distorting screen (Loi et al., 2015; Helmboldt & Hurley-Walker, 2020). We thus add a module to the imaging pipeline to routinely produce these binocular images.

In choosing the sub-arrays from the extended Phase II, we face a compromise between sensitivity (higher for large sub-arrays) and parallax lever-arm (better for widely-separated sub-arrays). Additionally we have no prior knowledge of what ionospheric activity will be observed on the night, nor the resources to adjust the imaging to match at the time of processing. To form a generally useful product, we split the array into two pairs of sub-arrays following the cardinal directions, shown in Fig. 21. Each group of 43 or 44 antennas is imaged separately, and source-finding is performed using the default settings of Aegean. These catalogues can form a useful input to future analyses of the ionosphere above the Murchison Radio-astronomy Observatory; the data and analysis will be released in future work.

In this work we described GLEAM-X, a new wideband low-frequency all-southern-sky survey performed using the MWA, as well as the data reduction steps we expect to use to produce a range of continuum data products over 72-231 MHz. Polarisation data will be described in the upcoming paper by Zhang et al. (in prep). Extensions to our data reduction pipeline to perform transient searches (Section 5.1) and binocular imaging (Section 5.2), as well as joint deconvolution of the Galactic Plane (Fig. 3) will further enhance the capabilities of the survey.

To demonstrate the quality and attributes of the images and catalogues that will be produced by GLEAM-X, we release here 1,447 deg$^2$ of sky in the form of 26 mosaics across 72-231 MHz of bandwidths 60, 30, and 8 MHz, with RMS noises ranging from 15 to just over 1 mJy beam$^{-1}$. Additionally, we form a catalogue of 78,967 sources, 70,432 of which are well-fit across our band with power-law spectral energy distributions, and 888 with curved power-law spectra. Extrapolating our source density of 55 deg$^{-2}$ to the ∼31,000 deg$^2$ that GLEAM-X will eventually cover, we expect to detect of order 1.7 M sources, and produce ∼ 1.5 M radio spectra.
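The extrapolation can be reproduced from the numbers quoted in this work. The only assumption in the sketch below is that the source density and the fraction of spectrally fitted sources in the pilot field carry over unchanged to the full survey area.

```python
n_sources = 78967            # catalogued components in this release
n_fitted = 70432 + 888       # power-law plus curved power-law fits
pilot_area = 1447.0          # deg^2
full_area = 31000.0          # deg^2, approximate eventual coverage

density = n_sources / pilot_area
expected_sources = density * full_area
expected_spectra = expected_sources * n_fitted / n_sources

print(f"density = {density:.1f} per deg^2")
print(f"expected sources = {expected_sources / 1e6:.2f} M")
print(f"expected spectra = {expected_spectra / 1e6:.2f} M")
```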

We plan to release the survey in a series of data releases; the next will comprise a large (∼ 15,000 deg 2 ) set of images and catalogues covering the southern extragalactic sky centered on the South Galactic Pole (Galvin et al. in prep) ; secondly we aim to process and release the complete Galactic Plane (Hurley-Walker et al. in prep) ; finally, we will aim to produce contiguous all-sky coverage. Polarisation, transient, and ionospheric data releases and analyses will also proceed over coming years.

These data will enable a range of science outcomes, some of which are outlined by Beardsley et al. (2019) in their review of scientific opportunities with Phase II of the MWA. For instance, there is strong potential to detect $10^4$ peaked-spectrum sources in GLEAM-X data, an order of magnitude more than discovered by GLEAM (Callingham et al., 2017), and also probing a population an order of magnitude fainter. Improved signal-to-noise on sources with curved and peaked spectra can provide more efficient selection of high-redshift radio galaxies (Drouart et al., 2020). Many local star-forming galaxies will be resolved, enabling better understanding of the interplay between thermal and non-thermal processes in their energy budgets (Kapińska et al., 2017; Galvin et al., 2018).

The extended configuration of the Phase II MWA has already been used very capably for targeted investigations of the extragalactic sky, such as determining the remnant radio galaxy fraction in one of the Galaxy and Mass Assembly fields (Quici et al., 2021) and detecting diffuse non-thermal emission in galaxy clusters (Duchesne et al., 2021). Similar studies over the whole sky, particularly exploiting synergies with other recent wide-area surveys such as RACS, are likely to be highly productive. The higher source density of GLEAM-X will for the first time enable cosmological measurements with the MWA. We can resolve the tension between the angular clustering observed with NVSS and TGSS-ADR1 (Dolfi et al., 2019), investigate differential source counts (Chen & Schwarz, 2015), and by cross-correlating with measurements of the Cosmic Microwave Background, search for the effects of dark energy via the integrated Sachs-Wolfe effect (Sachs & Wolfe, 1967). Additionally, GLEAM-X may help to improve sky models for studies of the Epoch of Reionisation, by measuring source brightnesses below 100 MHz, imaging slightly deeper, and separating sources into more components than LoBES (Lynch et al., 2021).

Continuum Galactic science shows promise with MWA Phase II (Tremblay et al., 2022), and given the excellent results from our initial exploration of jointly deconvolving GLEAM and GLEAM-X, we expect to make new detections of supernova remnants (SNRs; see e.g. Hurley-Walker et al., 2019a) and improve measurements of cosmic ray electrons in the Galactic Plane (following Su et al., 2018). Additionally, the improved resolution, sensitivity, and wide bandwidth will make it possible to examine the unshocked ejecta of SNRs (Arias et al., 2018) and their interactions with their environments (Castelletti et al., 2021) via measurements of low-frequency thermal absorption. This creates excellent synergy with TeV observations by the High Energy Stereoscopic System (Hinton & HESS Collaboration, 2004; Aharonian et al., 2006) and the upcoming Cherenkov Telescope Array (Acharya et al., 2013) to search for sites of cosmic ray acceleration in our Galaxy (e.g. Maxted et al., 2019).

The repeated, overlapping epochs of GLEAM-X and its drift scan observing strategy make it possible to explore radio transients and variability on timescales from seconds to years; comparisons to GLEAM enable a seven-year lever arm. Combining these cadences with the wide bandwidth will enable improved investigation of the startling variability of peaked-spectrum sources found by Ross et al. (2021), and enable distance measurements for dispersion-smeared pulsed transients (Hurley-Walker et al., 2022). As evinced by the latter work, GLEAM-X opens new parameter space in the low-frequency radio sky, and potentially enables further serendipitous discoveries beyond our ability to predict.
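As an illustration of why a wide band helps with dispersion-smeared pulses, recall that the cold-plasma dispersion delay between two frequencies scales as $\mathrm{DM}\,(\nu_\mathrm{lo}^{-2} - \nu_\mathrm{hi}^{-2})$. The following minimal sketch is not taken from the GLEAM-X pipeline; the function names are hypothetical, and the conversion from dispersion measure to distance assumes a uniform electron density of 0.03 cm$^{-3}$ purely for illustration, whereas a real estimate would use a Galactic electron density model.

```python
# Standard cold-plasma dispersion constant, in s MHz^2 pc^-1 cm^3.
K_DM = 4.148808e3


def dispersion_delay_s(dm, f_lo_mhz, f_hi_mhz):
    """Delay (s) of the pulse at f_lo relative to f_hi for a DM in pc cm^-3."""
    return K_DM * dm * (f_lo_mhz ** -2 - f_hi_mhz ** -2)


def dm_from_delay(delay_s, f_lo_mhz, f_hi_mhz):
    """Invert the delay relation to recover the DM (pc cm^-3)."""
    return delay_s / (K_DM * (f_lo_mhz ** -2 - f_hi_mhz ** -2))


def rough_distance_pc(dm, n_e=0.03):
    """Distance (pc) for a uniform electron density n_e (cm^-3).

    The uniform-density value is an illustrative assumption only.
    """
    return dm / n_e


# Delay across the 72 - 231 MHz survey band for a hypothetical DM of 50 pc cm^-3.
delay = dispersion_delay_s(50.0, 72.0, 231.0)
recovered_dm = dm_from_delay(delay, 72.0, 231.0)
print(f"delay = {delay:.1f} s, recovered DM = {recovered_dm:.1f} pc cm^-3")
```

Because the delay grows as $\nu^{-2}$, most of the sweep occurs at the bottom of the band, which is why measurements extending down to 72 MHz constrain the dispersion measure well.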

We thank the anonymous referee for their comments, which improved the quality of this paper. NHW is supported by an Australian Research Council Future Fellowship (project number FT190100231) funded by the Australian Government. KR acknowledges a Doctoral Scholarship and an Australian Government Research Training Programme scholarship administered through Curtin University. DK was supported by NSF grant AST-1816492. CJR acknowledges financial support from the ERC Starting Grant 'DRANOEL', number 714245. This scientific work makes use of the Murchison Radio-astronomy Observatory, operated by CSIRO. We acknowledge the Wajarri Yamatji people as the traditional owners of the Observatory site. Support for the operation of the MWA is provided by the Australian Government (NCRIS), under a contract to Curtin University administered by Astronomy Australia Limited. Establishment of the Murchison Radio-astronomy Observatory and the Pawsey Supercomputing Centre are initiatives of the Australian Government, with support from the Government of Western Australia and the Science and Industry Endowment Fund. We acknowledge the Pawsey Supercomputing Centre which is supported by the Western Australian and Australian Governments and the China SKA Regional Center prototype at Shanghai Astronomical Observatory which is funded by the Ministry of Science and Technology of China (under grant number 2018YFA0404603) and Chinese Academy of Sciences (under grant number 114231KYSB20170003). Access to Pawsey Data Storage Services is governed by a Data Storage and Management Policy (DSMP). ASVO has received funding from the Australian Commonwealth Government through the National eResearch Collaboration Tools and Resources (NeCTAR) Project, the Australian National Data Service (ANDS), and the National Collaborative Research Infrastructure Strategy. This paper makes use of services or code that have been provided by AAO Data Central (datacentral.org.au).
This research has made use of NASA's Astrophysics Data System Bibliographic Services. The following software was used in this work: AOFlagger and Cotter (Offringa et al., 2012); WSClean (Offringa et al., 2014; Offringa & Smirnov, 2017); Aegean; miriad (Sault et al., 1995); TopCat (Taylor, 2005); NumPy (Dubois et al., 1996; Harris et al., 2020); AstroPy (Astropy Collaboration et al., 2013); SciPy (Oliphant, 2007); Matplotlib (Hunter, 2007). This work was compiled in the very useful online LaTeX editor Overleaf.

Table A1 GLEAM-X observing summary. The HA and Dec are fixed to the locations shown and the sky drifts past for the observing time shown. Observations typically start just after sunset and stop just before sunrise. The four nights published in this work are shown in bold font. Nights identified as having high ionospheric activity are marked with a "*".

Date          HA   Dec   Observing time
              −1   −12   7.9
2020-10-03    −1   +1    9.8
2020-10-04*   −1   +20   9.8
2020-10-05    0    −71   9.8
2020-10-06    0    −55   9.8
2020-10-07    0    −40   9.8
2020-10-08    0    −26   9.8
2020-10-09    0    −12   9.6
2020-10-10*   0    +1    8.8
2020-10-11    0    +20   9.8
2020-10-12    0    −71   8.6
2020-10-13    0    −55   8.4
2020-10-14    0    −40   5.6
2020-10-15    0    −26   9.1
2020-10-16    0    −12   9.0
2020-10-17*   0    +1    9.8
2020-10-18    0    +20   9.7
2020-10-19    +1   −71   9.5
2020-10-20    +1   −55   9.5
2020-10-21    +1   −40   9.5
2020-10-22    +1   −26   8.2
2020-10-23    +1   −12   9.5
2020-10-24*   +1   +1    9.5
2020-10-25    +1   +20   8.0
Total:                   1,056.5

Table A2 Column numbers, names, and units for the catalogue. Source names follow International Astronomical Union naming conventions for co-ordinate-based naming. Background and RMS measurements were performed by BANE (Section 3.7); PSF measurements were performed using in-house software as described in Section 3.6; the fitted spectral index parameters were derived as described in Section 4.4; all other measurements were made using Aegean. Aegean incorporates a constrained fitting algorithm. Shape parameters with an error of −1 indicate that the reported value is equal to either the upper or lower fitting constraint. The columns with the subscript "wide" are derived from the 200 MHz wide-band image. Subsequently, the subscript indicates the central frequency of the measurement, in MHz. These sub-band measurements are made using the priorised fitting mode of Aegean, where the position and shape of the source are determined from the wide-band image, and only the flux density is fitted (see Section 4.1). Note therefore that some columns in the priorised fit do not have error bars, because they are linearly propagated from the wide-band image values (e.g. major axis a).

Number Name Unit Description 
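The priorised fitting described in the Table A2 caption can be illustrated with a toy example. When a component's position and shape are held fixed, the model is linear in its amplitude, so the fit reduces to a one-parameter least-squares solve. The sketch below is not Aegean's implementation: the helper names, the circular Gaussian template, and the noise-free test image are all assumptions made for illustration, and it fits a peak amplitude rather than an integrated flux density.

```python
import math


def gaussian_template(n, x0, y0, sigma):
    """Unit-peak circular Gaussian on an n x n grid, with fixed centre and width."""
    return [
        [math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))
         for x in range(n)]
        for y in range(n)
    ]


def fit_amplitude(image, template):
    """Amplitude A that minimises sum((image - A * template)**2)."""
    num = sum(i * t
              for irow, trow in zip(image, template)
              for i, t in zip(irow, trow))
    den = sum(t * t for trow in template for t in trow)
    return num / den


# Shape and position come from a (hypothetical) wide-band fit and stay fixed.
template = gaussian_template(33, x0=16.0, y0=16.0, sigma=3.0)

# Noise-free sub-band image containing the same component with peak 0.25.
image = [[0.25 * t for t in row] for row in template]

fitted_peak = fit_amplitude(image, template)
print(f"fitted peak = {fitted_peak:.3f}")
```

Holding the shape fixed in this way is also why the shape columns of the sub-band measurements carry no independent uncertainties: they are inherited from the wide-band fit rather than re-estimated.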
