key: cord-0630173-gn6as2c3 authors: Hurley-Walker, Natasha; Galvin, Timothy J.; Duchesne, Stefan W.; Zhang, Xiang; Morgan, John; Hancock, Paul J.; An, Tao; Franzen, Thomas M. O.; Heald, George; Ross, Kathryn; Vernstrom, Tessa; Anderson, Gemma E.; Gaensler, Bryan M.; Johnston-Hollitt, Melanie; Kaplan, David L.; Riseley, Christopher J.; Tingay, Steven J.; Walker, Mia title: GaLactic and Extragalactic All-sky Murchison Widefield Array survey eXtended (GLEAM-X) I: Survey Description and Initial Data Release date: 2022-04-27 journal: nan DOI: nan sha: 198e96aa0dfe5bf31fe2daa2f2a250974480a8f7 doc_id: 630173 cord_uid: gn6as2c3 We describe a new low-frequency wideband radio survey of the southern sky. Observations covering 72 - 231 MHz and Declinations south of $+30^\circ$ have been performed with the Murchison Widefield Array "extended" Phase II configuration over 2018 - 2020 and will be processed to form data products including continuum and polarisation images and mosaics, multi-frequency catalogues, transient search data, and ionospheric measurements. From a pilot field described in this work, we publish an initial data release covering 1,447 sq. deg over 4h 10 , and thus is not optimal for recovering the complex emission present in the Galactic Plane. However, the original GLEAM survey was recorded in an identical set of drift scan pointings to GLEAM-X, and at that time the array configuration provided many baselines with sensitivity to these larger angular scales. Thus, for the Galactic plane, we will jointly deconvolve the short baselines of GLEAM with the full GLEAM-X measurement sets, a process enabled by the fast GPU-based image-domain gridding extension to WSClean (van der Tol et al., 2018). This method has been used to great effect to image Fornax A and Centaurus A (McKinley et al., 2022), and can also be used for other extended sources such as the Magellanic Clouds. An example of these results is shown in Fig.
3 and the full description of the process in the context of the Galactic Plane will be given in a further paper (Hurley-Walker et al. in prep). The ionosphere introduces a $\lambda^2$-dependent position shift to the observed radio sources, which varies with position on the sky. Following the method of Hurley-Walker et al. (2017) and Hurley-Walker et al. (2019b), we use fits_warp to calculate a model of position shifts based on the difference in positions between the sources in the snapshot and those in a reference catalogue, and then use this model to de-distort the images. For the reference catalogue, we benefit from using catalogues with similar resolution (∼1 ) covering wide areas. For declinations north of −30°, we use NVSS at 1.4 GHz, and for the southern sky SUMSS at 843 MHz. From this combined catalogue we select a subset which is sparse (no internal matches within 3 ) and unresolved (integrated to peak flux density ratio of < 1.2). For each of the 7.68-MHz sub-bands and the wideband 30.72-MHz images formed from each observation, we estimate the background and RMS noise σ using BANE, and perform source-finding using Aegean, with a minimum "seed" threshold of 5σ. Using the iterative catalogue cross-matching functionality of fits_warp, we cross-match the measured sources to the reference catalogue, typically finding 1,000-3,000 cross-matches, from which we retain the 750 brightest sources. A greater number of sources does not improve the accuracy of the warping for typical ionospheric conditions, but does add computational load, so this value was chosen as a point of diminishing returns. These sources typically have flux densities > 1 Jy in the NVSS and SUMSS surveys, so have adequate astrometry to form the baselines for our corrections. Snapshot images with fewer than 100 successful cross-matches are discarded (typically < 1 % of images).
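The selection of cross-matches described above can be sketched as follows. This is a minimal illustration rather than the fits_warp implementation: the function name `select_warp_sources`, the dictionary key `peak_flux`, and the argument names are hypothetical, and only the 750-source cap and the 100-match minimum are taken from the text.

```python
from typing import Optional


def select_warp_sources(matches: list[dict], n_brightest: int = 750,
                        min_matches: int = 100) -> Optional[list[dict]]:
    """Return the brightest cross-matches, or None if the snapshot is discarded.

    `matches` is a list of cross-matched sources, each a dict with a
    (hypothetical) 'peak_flux' key holding the flux density in Jy.
    """
    if len(matches) < min_matches:
        return None  # too few cross-matches to build a reliable warp model
    # Sort brightest first and keep at most n_brightest sources
    ranked = sorted(matches, key=lambda m: m["peak_flux"], reverse=True)
    return ranked[:n_brightest]
```

A snapshot for which this returns `None` would be dropped before the warp model is computed.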
The position shifts in the remaining images are typically of order 25 -5 over 72-231 MHz, and are coherent on scales of 1-20°, similar to previous studies with the MWA (e.g. Helmboldt & Hurley-Walker, 2020). fits_warp uses these position shifts to create a warp model, apply it to all pixels, and interpolate the results back on to the original pixel grid. This technique yields residual astrometric offsets (with no obvious preferred direction or structure) of order 6 at the lowest frequencies, and 2 at the highest frequencies. While the primary beam model developed by Sokolowski et al. (2017) is significantly more accurate than previous models of the MWA primary beam, there remain some discrepancies between our measured source flux densities and those predicted from existing work. In part, this is due to the flagging of individual dipoles in different tiles across the array, which gives these tiles a different and unmodelled primary beam response. For the observations processed in this work, 72 tiles were fully functional, 39 tiles contained one dead dipole, 14 contained two dead dipoles, and three tiles were flagged for having three or more non-functional dipoles. Including the effect of the flagging by computing and using multiple primary beams at the calibration and imaging stages is computationally expensive, so instead a correction is made after the images are formed. We cross-match each snapshot with a sparse (no internal matches within 5 ), unresolved (major axis a × minor axis b < 2 × 2 ) version of the GLEAM-derived catalogue used for calibration (Section 3.1) and make a global mean flux density scale correction using the flux_warp package (Duchesne et al., 2020), typically of order 5-15 %. After this global shift, we accumulate the cross-matched tables. Since the discrepancy is consistent in Hour Angle and Dec between snapshots, we can combine the information in this frame of reference.
For each frequency, as a function of HA and Dec, we compare the $\log_{10}$ of the ratio $R$ of the integrated flux densities of the measured source values and reference catalogue. Similarly to GLEAM ExGal, we find no trends with HA, and up to ±10 % trends in Dec. Fig. 4 shows this effect for a typical frequency band. A fifth-order polynomial model is fitted as a function of Dec using a weighted least-squares fit, where the weights are the signal-to-noise of the sources as measured in each snapshot. The standard deviation of the data from the model ($\sigma_{\rm poly}$) is measured, and sources with $|R| > 3\sigma_{\rm poly}$ are removed from the data. A final model of the same form is fitted to the remaining data, forming a correction function which is then applied to every individual snapshot. After correction, the primary-beam-corrected 30-MHz MFS images have snapshot RMS values of 35-4 mJy beam$^{-1}$ over 72-231 MHz at their centres, where the primary beam sensitivity is highest.
Figure 4. The upper panel shows the change in $\log_{10} R$ as a function of Dec, where $R$ is the ratio of measured integrated flux density to model integrated flux density. The lower panel shows the same after the polynomial correction function (blue line) has been fit and applied. The adjacent histogram shows the resulting distribution of $\log_{10} R$ over the drift scan. The full-width-at-half-maximum of the resulting histogram is ∼ 2.5 %. Similar results are obtained for other frequency bands.
The goal of continuum mosaicking is to combine the astrometrically- and primary-beam-corrected snapshot images into deeper images with reduced noise, revealing fainter sources and diffuse structures invisible in the individual snapshots. For optimal signal-to-noise when mosaicking the night-long scans together, we use inverse-variance weighting. The weight maps are derived from the square of the primary beam model, scaled by the inverse of the square of the RMS at the centre of the image, as calculated by BANE.
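The weighting scheme above can be illustrated with a short sketch. It assumes the snapshots have already been regridded on to a common pixel grid and are primary-beam-corrected, so that the per-pixel noise is the central RMS divided by the beam response. The function name and argument layout are illustrative and are not taken from the GLEAM-X pipeline.

```python
import numpy as np


def mosaic_inverse_variance(images, beams, centre_rms):
    """Combine snapshots with weights w = beam**2 / rms**2.

    images     : list of 2-D arrays (primary-beam-corrected, same pixel grid)
    beams      : list of 2-D arrays, primary beam response (0..1) per snapshot
    centre_rms : list of floats, RMS noise at the centre of each snapshot
    """
    images = np.asarray(images, dtype=float)
    beams = np.asarray(beams, dtype=float)
    rms = np.asarray(centre_rms, dtype=float)[:, None, None]
    weights = beams**2 / rms**2          # inverse variance of each pixel
    wsum = weights.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        mosaic = (weights * images).sum(axis=0) / wsum
        noise = 1.0 / np.sqrt(wsum)      # expected RMS of the weighted mean
    return mosaic, noise
```

Pixels with zero total weight come out as NaN in the mosaic, which is one simple way of marking sky that no snapshot covers.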
As discussed in Section 2, GLEAM-X was observed at three different hour angles. This gives each drift scan slightly different (u, v)-coverage, which results in a slightly different restoring beam and thus point spread function (PSF). While each individual drift scan would have a unique and very nearly Gaussian PSF, it could be expected that stacking different Gaussians with different position angles would result in a non-Gaussian shape. Since most source-finders expect sources to be well-approximated by Gaussians, we tested this effect in our mosaicking procedure. We selected the scans with the most dissimilar (u, v)-coverage where there would be significant overlap in sources, those at HAs −1 and +1 from the Dec+2 scans, i.e. where the sky is rotating most quickly and projection effects are most important. We simulated a grid of 1 Jy point sources at common RA and Decs for seven observations from each of these scans, and ran them through our imaging and mosaicking stages, using unity image weighting and omitting the (here unnecessary) astrometric and primary beam corrections. We used Aegean to source-find on the resulting mosaic, making corrections as necessary for the projection (Section 4). We recovered the sources at integrated flux densities of 0.995-0.999 Jy and peak flux densities of 0.96-0.97 Jy beam$^{-1}$. Subtracting these Gaussian fits from the image plane data, we found residuals at the < 4.5 % level, indicating a deviation from Gaussianity of that order. Since the integrated flux densities were recovered well, and the non-Gaussianity is fairly small, even for this worst-case scenario, we adopt this mosaicking method going forward. For each 7.68-MHz frequency channel, we form a night-long drift scan, and examine it to check for any remaining data quality issues. We also form five 30.72-MHz bandwidth mosaics from the multi-frequency synthesis images generated during cleaning (Section 3.2).
After quality checking, for each frequency, data from all four nights that cover the same RA range are combined to make a single deep mosaic. At this stage, we also form a 60-MHz bandwidth "wideband" image over 170-231 MHz, as this gives a good compromise between sensitivity and resolution, and will be used for source-finding (Section 4). As described in Appendix A of Duchesne et al. (2020), imaging away from the phase centre incurs a significant phase rotation during re-gridding as part of the mosaicking process. This re-projection results in a point-spread function that is not defined at the image reference coordinates. This is corrected partially by introduction of a projected regrid factor, $f$, that is applied to the PSF major axis to form an 'effective' PSF major axis. For a resultant ZEA projection this is simply related to the change in solid angle over the original SIN-projected image with (e.g. Thompson et al., 2001)
$$ f = \frac{1}{\sqrt{1 - l^2 - m^2}} \tag{1} $$
where $l$ and $m$ are the direction cosines defined with reference to the original, SIN-projected image direction. The ZEA projection itself reduces additional area-related projection effects due to its equal area nature. This effective PSF is used in initial source-finding on the mosaics as the integrated flux density is correct and the product of the major and minor PSF axes is also correct for the new projection. Residual uncorrected ionospheric distortions can cause slight blurring of the final mosaicked PSF. This can be characterised by examining sources which are known to be unresolved, which is best determined by using a higher-resolution catalogue than our calibration sky model; we thus use the NVSS and SUMSS combined catalogue described in Section 3.3. Following Hurley-Walker et al. (2017, 2019b), we cross-match this catalogue with the sources detected in our mosaics at signal-to-noise > 10, and then measure the size and shape of these sources in the GLEAM-X mosaics.
We create a PSF map by averaging and interpolating over these sources, using HEALPix (order = 4, i.e. pixels ∼3° on each side) as a natural frame in which to accumulate and average source measurements. After the PSF map has been measured, its antecedent mosaic is multiplied by a (position-dependent) "blur" factor of
$$ B = \frac{a_{\rm PSF}\, b_{\rm PSF}}{a_{\rm rst}\, b_{\rm rst}} \tag{2} $$
where $a_{\rm rst}$ and $b_{\rm rst}$ are the FWHM of the major and minor axes of the restoring beam, and $a_{\rm PSF}$ and $b_{\rm PSF}$ are the FWHM of the major and minor axes of the PSF. This has the effect of normalising the flux density scale such that both peak and integrated flux densities agree, as long as the correct, position-dependent PSF is used. Values of $B$ are typically 1.0-1.2. The mosaicking stage of Section 3.5 results in 26 mosaics: one with 60-MHz bandwidth, five with 30-MHz bandwidth and the other 20 covering 72-231 MHz in 7.68-MHz narrow bands. In this work, we run the pipeline on four nights of observing indicated in Table A1, producing a large set of mosaics with decreasing sensitivity toward the edges. Here we down-select to a region which is representative of the survey's eventual sensitivity, covering 4 h ≤ RA ≤ 13 h, −32.7° ≤ Dec ≤ −20.7°, for further analysis. Figs. 5-7 show this area for four of the deeper mosaics. Postage stamps of these images are available on both SkyView and the GLEAM-X website. The header of every postage stamp contains the PSF information calculated in Section 3.6, and the completeness information calculated in Section 4.3. We use BANE to determine the background and RMS noise of each mosaic. During development of this survey, we noticed that BANE's default of three loops of 3-sigma-clipping is insufficient to exclude source-filled pixels, and thus to accurately determine the background and RMS noise. The issue may not have been noticed in previous works because they lacked the higher sensitivity, and resulting higher source density, of GLEAM-X (although Hurley-Walker et al.
(2017) noted a similar effect from the high sidelobe confusion levels of GLEAM). We modified BANE to use 10 loops and found that it produced more accurate noise and background estimates (see Section 4.2.2 for further analysis). Fig. 8 shows an example of 10 sq. deg of the 170-231 MHz wideband mosaic and associated background and RMS noise, as well as the same region as seen by GLEAM ExGal, in which the resolution is lower, the noise is higher, and the diffuse Galactic synchrotron on scales of > 1° is visible. Combining data in the image plane may lead to the recovery of faint sources that were not cleaned during imaging. The RMS noise levels in the wide-band (30-MHz) mosaics range from 5-1.3 mJy beam$^{-1}$ over 72-231 MHz. This compares to typical snapshot RMS values of 35-4 mJy beam$^{-1}$ over the same frequency range (Section 3.4). Cleaning is performed down to 1-σ for components detected at 3-σ in a snapshot (Section 3.2). The centres of each image form the greatest contribution to each mosaic due to weighting by the square of the primary beam (Section 3.5). We can therefore estimate at what signal-to-noise threshold uncleaned sources will typically appear: $3\times35/5 = 21$ to $3\times4/1.3 \approx 9$ from 72-231 MHz, and at $\sim 3\times4/1 = 12$ in the wideband (60-MHz) source-finding image (Section 4). Modelling this effect, especially in conjunction with Eddington bias (see e.g. Section 4.2.1), which is also significant at these faint flux densities, lies beyond the scope of this paper. It would involve significant work and is mainly of interest for performing careful measurement of low-frequency source counts (see Franzen et al., 2019, for an equivalent analysis for GLEAM). At this stage we merely suggest additional caution when using flux densities for sources at low (< 12) signal-to-noise. The mosaics at this stage are only a subset of the GLEAM-X sky. The RMS increases toward the edges due to the drop in primary beam sensitivity and selected RA range of these observations.
Future mosaics comprising further nights of observing will be combined to produce near-uniform sensitivity across the sky. A source catalogue derived from the images is a useful data product that enables straightforward cross-matching, spectral fitting, and population studies. We aim here to accurately capture components of sizes < 10 across all frequency bands, fitting elliptical two-dimensional Gaussians with Aegean. We carry out this process on the 1,447-deg$^2$ region selected in Section 3.5, and the steps are generally applicable to future mosaics produced from the survey. We follow the same strategy as Hurley-Walker et al. (2017): using the 170-231 MHz image, a deep wideband catalogue centred at 200 MHz is formed. We set the "seed" clip to four, i.e. pixels with flux density > 4σ are used as initial positions on which to fit components, where σ denotes the local RMS noise. After the sources are detected, we filter to retain only sources with integrated flux densities ≥ 5σ. We then use the "priorised" fitting technique to measure the flux densities of each source in the narrow-band images: the positions are fixed to those of the wide-band source-finding image, the shapes are predicted by convolving the shape in the source-finding image with the local PSF, and the flux density is allowed to vary. Where the sources are too faint to be fit, a forced measurement is carried out. We perform several checks on the quality of the catalogue, detailed below. In this Section we examine the errors reported in the catalogue. First, we examine the systematic flux density errors; then, we examine the noise properties of the wide-band source-finding image, as this must be close to Gaussian in order for sources to be accurately characterised, and for estimates of the reliability to be made, which we do in Section 4.3. Finally, we make an assessment of the catalogue's astrometric accuracy. These statistics are given in Table 1.
GLEAM forms the basis of the flux density calibration in this work, and in this Section we examine any differences between the flux densities measured here compared to those measured by GLEAM ExGal. We select compact sources from both catalogues (integrated / peak flux density < 2) that cross-match within a 15 radius, and have a good power-law spectral index fit (reduced $\chi^2$ ≤ 1.93; see Section 4.4). Curved- and peaked-spectrum sources comprise only a small proportion of the catalogue and are more likely to be variable (Ross et al., 2021), so are not included in this check. We excluded all sources in GLEAM-X data which have a cross-match within 2 in order to avoid selecting sources which are unresolved in GLEAM and resolve into multiple components in GLEAM-X. As surveys approach their detection limit, measured source flux densities are increasingly likely to be biased high due to noise; there are a larger number of faint sources available to be biased brighter by noise than there are bright sources available to be biased dimmer. Eddington (1913) describes corrections that can be made to an ensemble of measurements to remove this bias. For the purpose of this section, we wish to correct the individual GLEAM flux density measurements in order to check the GLEAM-X flux density scale. We use Eq. 4 of Hogg & Turner (1998) to predict the maximum likelihood true flux density of each of the GLEAM 200-MHz measurements: where σ is the local RMS noise, and $q$ is the logarithmic source count slope (i.e. the index in $dN/dS \propto S^{q}$); at these flux density levels $q = 1.54$ (Franzen et al., 2016). Fig. 9 plots the ratio of the two catalogue integrated flux densities as a function of signal-to-noise in GLEAM-X, with a correction applied to the GLEAM flux densities. The ratio trends toward 1.05 at higher flux densities, although the very brightest sources show only small discrepancies from unity.
Since the effect is small, we do not attempt to correct for it here, but may revisit our data processing in future to see if it can be reduced, corrected, or eliminated. Since the flux density scale is tied to GLEAM, which has an 8 % error relative to other surveys, this value may be used as an error when combining the data with other work. No obvious trends are visible in the fitted spectral indices (Fig. 10); we note that the error bars on the GLEAM-X measurements are uniformly smaller due to the increased signal-to-noise of the data. We briefly examine the noise properties of the source-finding 170-231-MHz image. We use an 18 deg$^2$ region centred on RA 10h30m, Dec −27°30′, with fairly typical source distribution. Following Hurley-Walker et al. (2017), we measure the background of the region using BANE, and subtract it from the image. We then use AeRes ("Aegean REsiduals") from the Aegean package to mask out all sources which were detected by Aegean, down to 0.2× the local RMS. We also use AeRes to subtract the sources to show the magnitude of the residuals. Histograms of the remaining pixels are shown, for the unmasked and masked images, in Fig. 11. The higher resolution of the GLEAM-X survey compared to GLEAM means that confusion forms a smaller fraction of the noise contribution, and thus the noise distribution is almost completely symmetric. Surveys close to the confusion limit will see a skew toward a more positive distribution, as seen by Hurley-Walker et al. (2017). Noise and background maps are made available as part of the survey data release. Following Hurley-Walker et al. (2017), we measure the astrometry using the 200-MHz catalogue, as this provides the locations and morphologies of all sources.
To determine the astrometry, high signal-to-noise (integrated flux density > 50σ) GLEAM-X sources are cross-matched with the isolated sparse NVSS and SUMSS catalogue (Section 3.3); the positions of sources in these catalogues are assumed to be correct and RA and Dec offsets are measured with respect to those positions. The average RA offset is +14 ± 700 mas, and the average Dec offset is +21 ± 687 mas (errors are 1 standard deviation). In 99 % of cases, fitting errors on the positions are larger than the measured average astrometric offsets. Given the scatter in the measurements, we do not attempt to make a correction for these offsets. As each snapshot has been corrected, residual errors should not vary on scales smaller than the size of the primary beam. Fig. 12 shows the density distribution of the astrometric offsets, and histograms of the RA and Dec offsets, which were used to calculate the values listed in this section. Following the same procedure as Hurley-Walker et al. (2017), simulations are used to quantify the completeness of the source catalogue at 200 MHz, using the wideband mosaics. We use 26 realisations, in each of which 25,000 simulated point sources of the same flux density are injected into the 170-231 MHz mosaics (at approximately 20 % of the true source density). The flux density of the simulated sources is different for each realisation, spanning the range $10^{-3}$ to $10^{-0.5}$ Jy in increments of 0.1 dex. The positions of the simulated sources are chosen randomly but not altered between realisations; to avoid introducing an artificial factor of confusion in the simulations, simulated sources are not permitted to lie within 5 of each other. Sources are injected into the mosaics using AeRes. The major and minor axes of the simulated sources are set to $a_{\rm PSF}$ and $b_{\rm PSF}$, respectively. For each realisation, the source-finding procedures described in Section 4 are applied to the mosaics and the fraction of simulated sources recovered is calculated.
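As an illustration of the bookkeeping in this simulation (not the code used for the survey), the sketch below builds the grid of injected flux densities and computes a recovered fraction with a Poisson uncertainty. The function names are hypothetical, and the error formula $\sqrt{N_d}/N_s$ is an assumption: it is one simple reading of "Poisson errors on the number of simulated sources detected".

```python
import math


def injected_flux_grid(log_min=-3.0, log_max=-0.5, step_dex=0.1):
    """Flux densities (Jy) from 10**log_min to 10**log_max in steps of step_dex."""
    n = round((log_max - log_min) / step_dex) + 1
    return [10 ** (log_min + i * step_dex) for i in range(n)]


def completeness(n_detected, n_simulated):
    """Recovered fraction and its Poisson error (assumed sqrt(N_d) / N_s)."""
    frac = n_detected / n_simulated
    err = math.sqrt(n_detected) / n_simulated
    return frac, err
```

With the default arguments the grid has one flux density per realisation, so its length equals the number of realisations quoted in the text.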
In cases where a simulated source is found to lie too close to a real (> 5σ) source to be detected separately, the simulated source is considered to be detected if the recovered source position is closer to the simulated rather than the real source position. This type of completeness simulation therefore accounts for sources that are omitted from the source-finding process through being too close to a brighter source. Fig. 13 shows the fraction of simulated sources recovered as a function of $S_{200\,\mathrm{MHz}}$. The completeness is estimated to be 50 % at ∼ 5.6 mJy rising to 90 % at ∼ 10 mJy; these flux densities were typically below the RMS noise in GLEAM ExGal. Errors on the completeness estimate are derived assuming Poisson errors on the number of simulated sources detected. Fig. 14 shows the spatial distribution of the completeness for the work presented here; the slight dependence on RA is largely due to the presence of bright sources in large mosaics, e.g. Hydra A at ∼ RA 09h20m, Dec −12°. The roll-off in Declination is due to the primary beam sensitivity of the single drift scan used in this work; in the full survey, multiple drift scans will be used to ensure near-uniform sensitivity and completeness across the sky. The completeness at any pixel position is given by $C = N_d/N_s$, where $N_s$ is the number of simulated sources in a circle of radius 6° centred on the pixel and $N_d$ is the number of simulated sources that were detected above 5σ within this same region of sky. The completeness maps, in FITS format, can be obtained from the supplementary material. Postage stamp images from the GLEAM-X VO server also include the estimated completeness at representative flux densities in their headers. To test the reliability of the source finder and check how many of the detected sources might be false detections, we use the same source-finding procedure as described above but search only for negative peaks.
Aegean is run with a seedclip of 4σ (allowing for detections with peaks above this limit) and detections outside of the central region are cut. This initially yields 1,144 negative detections. Filtering the results to retain only sources with integrated flux densities $S_{\rm int} > 5\sigma$ leaves 198 detections. Inspection revealed that some of these detections were artefacts around very bright sources, rather than noise peaks (see Fig. 15). There were also similar positive detections of artefacts around these bright sources. We filtered out any detections (positive or negative) that were:
• within 5 of a positive detection whose peak flux density was ≥ 2 Jy and where the absolute value of the ratio of the fainter peak to the bright peak was ≥ 350; or
• within 12 of a positive detection whose peak flux density was ≥ 6 Jy and where the absolute value of the ratio of the fainter peak to the bright peak was ≥ 650.
This accounts for the moderately bright artefacts closer in to the bright sources and fainter artefacts that can exist further out from very bright sources. This filtering cuts 157 positive detections and 149 negative detections. We also note that there is a tendency for negative sources to appear close to positive sources regardless of their brightness, potentially due to faint uncleaned sidelobes slightly reducing the map brightness very close to sources. These negative sources will not have positive counterparts, so potentially can also be filtered before estimating the reliability. The criterion in this case is that they cross-match with a positive source within 2 . An example is shown in Fig. 16. These comprise a further 46 sources which may optionally be removed. Comparing the filtered samples of negative to positive detections, we can estimate the number of positive detections that are false detections as a function of signal to noise.
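The comparison of negative and positive detections can be sketched as below. This is a minimal illustration under the assumption that the noise is symmetric, so that each negative detection stands in for one false positive detection in the same signal-to-noise bin. The function name and the bin edges in the usage note are hypothetical.

```python
def reliability_by_snr(pos_snr, neg_snr, bin_edges):
    """Reliability 1 - N_neg / N_pos in each signal-to-noise bin.

    pos_snr, neg_snr : signal-to-noise of the positive detections and of
                       the negative detections (as absolute values)
    bin_edges        : increasing list of bin edges
    Returns a list of (lower_edge, upper_edge, reliability); bins with no
    positive detections give a reliability of None.
    """
    out = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        n_pos = sum(lo <= s < hi for s in pos_snr)
        n_neg = sum(lo <= s < hi for s in neg_snr)
        rel = 1.0 - n_neg / n_pos if n_pos else None
        out.append((lo, hi, rel))
    return out
```

For example, `reliability_by_snr(pos, neg, [5, 5.5, 6, 8])` would give one reliability value for each of three bins, computed from whichever filtered samples of positive and negative detections are passed in.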
For a conservative estimate, where we do not apply the second filter, we find that at a signal-to-noise ratio of five, the number of false detections is just under 2 %, falling quickly to 1 % for $S_{\rm int} > 5.5\sigma$. If we also filter negative sources that lie close to positive sources, we find that the reliability is much higher, with only 0.75 % of sources false at 5σ, falling to none at 8σ. For each significance bin, we convert these fractions to a reliability estimate and plot them as a function of signal-to-noise in Fig. 17. We note that were the noise completely Gaussian, we would expect just one +5σ source in this sky area to appear purely by chance, and none with flux density > 5.5σ; i.e., a reliability of 99.999 % in the faintest bin, rising quickly to 100 %. We fit two models to the twenty narrow-band flux density measurements for all detected sources (using $S \propto \nu^{\alpha}$). The first model is a simple power-law parameterised as
$$ S_\nu = S_{\nu_0} \left(\frac{\nu}{\nu_0}\right)^{\alpha} \tag{4} $$
where $S_{\nu_0}$ is the brightness of the source, in Jy, at the reference frequency $\nu_0$, and $\alpha$ describes the gradient of the spectral slope in logarithmic space. We also extend this power-law model to
$$ S_\nu = S_{\nu_0} \left(\frac{\nu}{\nu_0}\right)^{\alpha} \exp\left[ q \ln\left(\frac{\nu}{\nu_0}\right)^{2} \right] \tag{5} $$
which includes the additional free parameter $q$ to capture any higher order spectral curvature features, where increasing $|q|$ captures stronger deviations from a simple power law; if $q$ is positive, the curve is opening upward (convex) and if $q$ is negative, the curve is opening downward (concave). This model is not physically motivated, and may not appropriately describe sources with different power-law slopes in the optically thin and thick regimes, but provides a useful filter to identify interesting sources. For both models we set $\nu_0$ to 200 MHz. To perform accurate spectral fitting, the errors on the flux density measurements must be known. Following Hurley-Walker et al. (2017), spectral fitting allows us to check the flux density consistency of the catalogue.
A flux density scaling error of 2 % yields a median reduced $\chi^2$ of unity across the catalogue, whereas higher or lower values bias the reduced $\chi^2$ lower or higher as a function of signal-to-noise.
Figure 15 (partial caption). The white circle shows a negative source that was not filtered, while the white × shows a negative source that was filtered for being too close to a bright source.
Figure 16. An example of a negative source found next to a positive source that could optionally be filtered when generating the reliability estimate. Black circles indicate detected positive components that are not filtered; the white + shows a negative source that can optionally be filtered.
Figure 17. Estimates of the reliability of the catalogue as a function of signal to noise. The lower blue curve shows a conservative estimate without filtering negative sources detected on the edges of positive sources. The upper red curve shows a more generous estimate derived after filtering these sources out. In comparison, GLEAM ExGal has a reliability of 98.9 %-99.8 % at these signal-to-noise levels.
We thus adopt 2 % as the measure of our internal flux density scale, and set the errors on the flux density to this value added in quadrature with the local fitting error from Aegean. (Note that 8 % is more appropriate when comparing with other catalogues as this is the flux density scale accuracy of GLEAM, to which GLEAM-X is tied; see Section 4.2.1.) We applied the Levenberg-Marquardt non-linear least-squares regression algorithm (as implemented in the scipy Python module; Virtanen et al., 2020) to Equations 4 and 5 for each detected source. We did not include narrow-bands with negative integrated flux density measurements.
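A simplified version of this fitting step is sketched below. It is not the procedure used in the paper: instead of a Levenberg-Marquardt fit in linear flux density, it fits the two models by weighted linear least squares in log space using `numpy.polyfit`, which is a reasonable approximation only at high signal-to-noise. It assumes the model forms $S_\nu = S_{\nu_0}(\nu/\nu_0)^{\alpha}$ and $S_\nu = S_{\nu_0}(\nu/\nu_0)^{\alpha}\exp[q\ln(\nu/\nu_0)^2]$, and the function name `fit_spectrum` is hypothetical.

```python
import numpy as np


def fit_spectrum(freq_mhz, flux_jy, flux_err_jy, nu0=200.0, curved=False):
    """Fit ln S = ln S0 + alpha*x (+ q*x**2), with x = ln(nu / nu0).

    Bands with non-positive flux density are dropped before fitting.
    Returns (S0, alpha) or, if curved=True, (S0, alpha, q).
    """
    freq = np.asarray(freq_mhz, dtype=float)
    flux = np.asarray(flux_jy, dtype=float)
    err = np.asarray(flux_err_jy, dtype=float)
    good = flux > 0
    x = np.log(freq[good] / nu0)
    y = np.log(flux[good])
    # First-order error propagation: sigma(ln S) = sigma_S / S, so w = 1/sigma
    w = flux[good] / err[good]
    coeffs = np.polyfit(x, y, 2 if curved else 1, w=w)
    if curved:
        q, alpha, ln_s0 = coeffs
        return np.exp(ln_s0), alpha, q
    alpha, ln_s0 = coeffs
    return np.exp(ln_s0), alpha
```

Fitting in log space keeps the example short and dependency-light; a fit that follows the text more closely would pass the same model functions to a non-linear least-squares routine and work with the flux densities directly.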
We discarded the fitting results if:
• there were fewer than 15 integrated flux density measurements for a source;
• a $\chi^2$ goodness-of-fit test indicated a > 99 % likelihood of an incorrectly-fit model; or
• $q/\Delta q < 3$, to ensure that deviations from a power law are statistically significant.
For this initial data release we included only the model with the lower reduced-$\chi^2$ statistic in our catalogue. Applying these criteria, a total of 70,432 and 888 source components have fitting results recorded for power-law and curved power-law models, respectively. Fig. 19 shows five example SEDs, four with either power-law or curved power-law models constrained using exclusively GLEAM-X, and one with GLEAM-X data supplemented with data from SUMSS and NVSS to fit a two-component power-law model described as
$$ S_\nu = \frac{S_p}{1 - e^{-1}} \left(1 - e^{-(\nu/\nu_p)^{\alpha_{\rm thin} - \alpha_{\rm thick}}}\right) \left(\frac{\nu}{\nu_p}\right)^{\alpha_{\rm thick}} \tag{6} $$
where $S_p$ is the brightness (Jy) at the peak frequency $\nu_p$ (MHz), and $\alpha_{\rm thin}$ and $\alpha_{\rm thick}$ are the spectral slopes in the optically thin and optically thick regimes, respectively (Callingham et al., 2017). For sources fit well by power-law SEDs, the distributions of spectral indices $\alpha$ with respect to flux density are plotted in Fig. 18. The median $\alpha$ for the brightest bin is −0.83, in excellent agreement with previous results (e.g. Mauch et al., 2003; Lane et al., 2014; Heald et al., 2015). The priorised fitting routine in Aegean separates the island finding stage from the component characterisation stage, and is analogous to aperture photometry in optical images. We use this in GLEAM-X to ensure that each radio component identified in our deep 170-231 MHz source-finding image has an equivalent component characterisation in each of the other 25 GLEAM-X images. This process however does not enforce spectral smoothness between images adjacent in frequency.
For GLEAM-X, priorised fitting becomes less reliable towards lower frequencies, where the PSF becomes large enough that nearby components are blended to the point where their brightness profiles cannot be distinguished. Although model optimisation methods may be able to constrain the total brightness across all components, the division of brightness between individual components becomes degenerate. We highlight an example of this behaviour in Figure 20. This problem is most apparent for sources that are slightly resolved and characterised as two separate components within 120 of one another. Further development of Aegean to perform component characterisation across all images jointly, while including a physically motivated parametrisation of the spectra, is planned to address this issue.

The resulting catalogue consists of 78,967 radio sources detected over 1,447 deg². Of these, 71,320 sources are fit well by power-law or curved-spectrum SEDs. The catalogue has 722 columns (see Appendix B) and is available via Vizier. The catalogue measurements can be used to perform more complex spectral fits, especially in conjunction with other radio measurements. Table 1 shows the properties of the images and catalogue in this data release, as well as some forward predictions for the full survey, in comparison to GLEAM.

The total data volume of GLEAM-X visibilities is large (∼ 2 PB) and file transfer operations comprise a significant proportion (∼ 40 %) of our processing time. When processing the data, each observation takes up ∼ 100 GB of disk space in visibilities, images, and metadata. Given the richness of the GLEAM-X survey, we are strongly motivated to perform additional operations on the data while they reside on disk, to avoid moving the data more often than necessary. In this section we discuss the current extensions to the pipeline that we expect will yield a range of science outcomes not possible with mosaicked images.
The wide field-of-view of the MWA combined with the repeated drift scanning strategy of GLEAM-X yields a dataset that is interesting to search for transient radio [...] single transient candidate, but understanding its nature was difficult with the (limited) data available. Historically this has been a common occurrence for low-frequency radio transients, with many unusual phenomena detected but never fully understood (e.g. Hyman et al., 2005; Stewart et al., 2016; Varghese et al., 2019).

The GLEAM-X drift scans were observed such that the LST was matched for repeated observations at the same pointing and frequency. This enabled a search using "visibility differencing", wherein calibrated measurement sets were differenced, and the resulting nearly-empty visibilities were inverted to form a dirty image, which could be used to search for transient sources (Honours thesis: O'Doherty 2022; Hancock et al. in prep.). One high-significance candidate was followed up using the large MWA archive, resulting in the discovery of a new type of highly polarised radio transient, repeating on the unusual timescale of 18.18 minutes (Hurley-Walker et al., 2022). The wide bandwidth of GLEAM-X was key to finding the dispersion measure of the source, and therefore estimating its distance. The visibility differencing approach resulted in a large number of false positives due to the differences in ionospheric conditions between observations.

The discovery of a new type of radio transient, and the utility of our polarisation and wideband measurements, motivates the inclusion of a transient imaging step in our routine pipeline processing. Our approach is to image every 4-s interval of each observation, at the same time subtracting the deep model that was formed during imaging (Section 3.7), the same approach that is currently used for imaging MWA interplanetary scintillation observations (Morgan et al. in prep.).
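The model-subtracted time-step imaging can be illustrated with a toy, image-domain sketch. This is an assumption-laden analogue: the text describes subtracting the deep model during imaging, whereas the code below simply subtracts a deep image from a stack of snapshot images, then applies the layout described in the following paragraph (time as the fastest axis, 16-bit floats). The function names are hypothetical and no HDF5 file is written.

```python
import numpy as np


def difference_cube(snapshots, deep_model):
    """Subtract the deep (time-averaged) image from every time step.

    snapshots: array of shape (n_time, ny, nx); deep_model: shape (ny, nx).
    """
    return snapshots - deep_model[np.newaxis, :, :]


def to_storage_layout(cube):
    """Reorder to (ny, nx, n_time) so that time is the fastest-varying axis
    in C order, and demote the pixel data to half-precision floats."""
    return np.ascontiguousarray(np.moveaxis(cube, 0, -1), dtype=np.float16)
```

With the cube in this layout, the light curve of any pixel is a contiguous run of samples, `stored[y, x, :]`, which is what makes per-pixel time-domain searches convenient.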
This model-subtracted imaging results in a thermal-noise-dominated Stokes-I image cube where only differences between each time step and the continuum average are recorded. This cube is then stored in an HDF5 file as described in Appendix 2 of Morgan et al. (2018). Briefly, the image cube is reordered so that time is the fastest axis, and the pixel data are demoted to half-precision (16-bit) floats. This results in a typical data volume of 600 MB per observation. Once in this format, any number of algorithms can be conveniently applied to detect and measure time-domain signals. Imaging every 0.5-s sample would be ideal, but it would multiply the storage and processing requirements of all other steps of the pipeline by a factor of 8. If a signal of interest is discovered, it is simple (and indeed necessary) to reprocess the data with higher time (and, if needed, frequency) resolution. Future data releases will provide these data and quantitative analyses thereof.

The source position offsets determined during the dewarping process (Section 3.3) yield information about the slant total electron content (dTEC) averaged over the telescope array projected onto the sky in that field-of-view. If dTEC varies significantly over the array, the wavefronts from different parts of the sky will arrive at different times, and radio sources will appear stretched, duplicated, or will disappear completely. Conversely, if images are created using sub-arrays of the telescope, the apparent difference in source positions can be used to constrain an approximate height of the distorting screen (Loi et al., 2015; Helmboldt & Hurley-Walker, 2020). We thus add a module to the imaging pipeline to routinely produce these binocular images. In choosing the sub-arrays from the extended Phase II, we face a compromise between sensitivity (higher for large sub-arrays) and parallax lever-arm (better for widely-separated sub-arrays).
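The dependence of the parallax lever-arm on sub-array separation can be illustrated with a toy geometric sketch. It assumes a thin distorting feature near zenith and a baseline perpendicular to the line of sight, so that the parallax angle p between the two sub-array views satisfies tan p = B/h for separation B and height h. This is a simplification for illustration, not the analysis of the cited works; the function names and example values are hypothetical.

```python
import math


def parallax_arcmin(baseline_m, height_m):
    """Parallax angle (arcmin) of a feature at `height_m` seen from two
    sub-arrays separated by `baseline_m` (zenith, thin-screen assumption)."""
    return math.degrees(math.atan2(baseline_m, height_m)) * 60.0


def screen_height_m(baseline_m, parallax_arcmin_value):
    """Invert the same geometry: height of the feature from its parallax."""
    parallax_rad = math.radians(parallax_arcmin_value / 60.0)
    return baseline_m / math.tan(parallax_rad)
```

For a fixed screen height, doubling the separation between sub-arrays roughly doubles the parallax angle, which is why widely separated sub-arrays give a better height constraint even though each has fewer antennas.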
Additionally, we have no prior knowledge of what ionospheric activity will be observed on the night, nor the resources to adjust the imaging to match at the time of processing. To form a generally useful product, we split the array into two pairs of sub-arrays following the cardinal directions, shown in Fig. 21. Each group of 43 or 44 antennas is imaged separately, and source-finding is performed using the default settings of Aegean. These catalogues can form a useful input to future analyses of the ionosphere above the Murchison Radio-astronomy Observatory; the data and analysis will be released in future work.

In this work we described GLEAM-X, a new wideband low-frequency all-southern-sky survey performed using the MWA, as well as the data reduction steps we expect to use to produce a range of continuum data products over 72-231 MHz. Polarisation data will be described in the upcoming paper by Zhang et al. (in prep). Extensions to our data reduction pipeline to perform transient searches (Section 5.1) and binocular imaging (Section 5.2), as well as joint deconvolution of the Galactic Plane (Fig. 3), will further enhance the capabilities of the survey.

To demonstrate the quality and attributes of the images and catalogues that will be produced by GLEAM-X, we release here 1,447 deg² of sky in the form of 26 mosaics across 72-231 MHz of bandwidths 60, 30, and 8 MHz, with RMS noises ranging from 15 to just over 1 mJy beam⁻¹. Additionally, we form a catalogue of 78,967 sources, 70,432 of which are well-fit across our band with power-law spectral energy distributions, and 888 with curved power-law spectra. Extrapolating our source density of 55 deg⁻² to the ∼31,000 deg² that GLEAM-X will eventually cover, we expect to detect of order 1.7 M sources, and produce ∼ 1.5 M radio spectra.
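The extrapolation above can be checked with a short calculation. The sketch assumes that the expected number of spectra scales with the fraction of pilot-field sources that have an accepted power-law or curved power-law fit; that scaling is an assumption made here for illustration, and the function name is hypothetical.

```python
def extrapolate_counts(n_sources, n_fitted, pilot_area_deg2, full_area_deg2):
    """Scale pilot-field counts to the full survey area.

    Returns (source density per square degree, expected number of sources,
    expected number of fitted spectra).
    """
    density = n_sources / pilot_area_deg2
    fitted_fraction = n_fitted / n_sources
    total_sources = density * full_area_deg2
    return density, total_sources, total_sources * fitted_fraction


# Values quoted in the text: 78,967 sources over 1,447 square degrees,
# 70,432 power-law and 888 curved power-law fits, ~31,000 square degrees in total.
density, total_sources, total_spectra = extrapolate_counts(
    78_967, 70_432 + 888, 1_447.0, 31_000.0)
```

The density rounds to 55 deg⁻², and the totals come out at about 1.7 million sources and about 1.5 million spectra, consistent with the figures quoted.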
We plan to release the survey in a series of data releases: the next will comprise a large (∼ 15,000 deg²) set of images and catalogues covering the southern extragalactic sky centered on the South Galactic Pole (Galvin et al. in prep); secondly, we aim to process and release the complete Galactic Plane (Hurley-Walker et al. in prep); finally, we will aim to produce contiguous all-sky coverage. Polarisation, transient, and ionospheric data releases and analyses will also proceed over coming years.

These data will enable a range of science outcomes, some of which are outlined by Beardsley et al. (2019) in their review of scientific opportunities with Phase II of the MWA. For instance, there is strong potential to detect 10⁴ peaked-spectrum sources in GLEAM-X data, an order of magnitude more than discovered by GLEAM (Callingham et al., 2017), and also probing a population an order of magnitude fainter. Improved signal-to-noise on sources with curved and peaked spectra can provide more efficient selection of high-redshift radio galaxies (Drouart et al., 2020). Many local star-forming galaxies will be resolved, enabling better understanding of the interplay between thermal and non-thermal processes in their energy budgets (Kapińska et al., 2017; Galvin et al., 2018).

The extended configuration of the Phase II MWA has already been used very capably for targeted investigations of the extragalactic sky, such as determining the remnant radio galaxy fraction in one of the Galaxy and Mass Assembly fields (Quici et al., 2021) and detecting diffuse non-thermal emission in galaxy clusters (Duchesne et al., 2021). Similar studies over the whole sky, particularly exploiting synergies with other recent wide-area surveys such as RACS, are likely to be highly productive. The higher source density of GLEAM-X will for the first time enable cosmological measurements with the MWA.
We can resolve the tension between the angular clustering observed with NVSS and TGSS-ADR1 (Dolfi et al., 2019), investigate differential source counts (Chen & Schwarz, 2015), and by cross-correlating with measurements of the Cosmic Microwave Background, search for the effects of dark energy via the integrated Sachs-Wolfe effect (Sachs & Wolfe, 1967). Additionally, GLEAM-X may help to improve sky models for studies of the Epoch of Reionisation, by measuring source brightnesses below 100 MHz, imaging slightly deeper, and separating sources into more components than LoBES (Lynch et al., 2021).

Continuum Galactic science shows promise with MWA Phase II (Tremblay et al., 2022), and given the excellent results from our initial exploration of jointly deconvolving GLEAM and GLEAM-X, we expect to make new detections of supernova remnants (SNRs; see e.g. Hurley-Walker et al., 2019a) and improve measurements of cosmic ray electrons in the Galactic Plane (following Su et al., 2018). Additionally, the improved resolution, sensitivity, and wide bandwidth will make possible the examination of the unshocked ejecta of SNRs (Arias et al., 2018) and interactions with their environments (Castelletti et al., 2021) via measurements of low-frequency thermal absorption. This creates excellent synergy with TeV observations by the High Energy Stereoscopic System (Hinton & HESS Collaboration, 2004; Aharonian et al., 2006) and the upcoming Cherenkov Telescope Array (Acharya et al., 2013) to search for sites of cosmic ray acceleration in our Galaxy (e.g. Maxted et al., 2019).

The repeated, overlapping epochs of GLEAM-X and its drift scan observing strategy make it possible to explore radio transients and variability on timescales from seconds to years; comparisons to GLEAM enable a seven-year lever arm. Combining these cadences with the wide bandwidth will enable improved investigation of the startling variability of peaked-spectrum sources found by Ross et al. (2021), and enable distance measurements for dispersion-smeared pulsed transients (Hurley-Walker et al., 2022). As evinced by the latter work, GLEAM-X opens new parameter space in the low-frequency radio sky, and potentially enables further serendipitous discoveries beyond our ability to predict.

We thank the anonymous referee for their comments, which improved the quality of this paper. NHW is supported by an Australian Research Council Future Fellowship (project number FT190100231) funded by the Australian Government. KR acknowledges a Doctoral Scholarship and an Australian Government Research Training Programme scholarship administered through Curtin University. DK was supported by NSF grant AST-1816492. CJR acknowledges financial support from the ERC Starting Grant 'DRANOEL', number 714245. This scientific work makes use of the Murchison Radio-astronomy Observatory, operated by CSIRO. We acknowledge the Wajarri Yamatji people as the traditional owners of the Observatory site. Support for the operation of the MWA is provided by the Australian Government (NCRIS), under a contract to Curtin University administered by Astronomy Australia Limited. Establishment of the Murchison Radio-astronomy Observatory and the Pawsey Supercomputing Centre are initiatives of the Australian Government, with support from the Government of Western Australia and the Science and Industry Endowment Fund. We acknowledge the Pawsey Supercomputing Centre, which is supported by the Western Australian and Australian Governments, and the China SKA Regional Center prototype at Shanghai Astronomical Observatory, which is funded by the Ministry of Science and Technology of China (under grant number 2018YFA0404603) and the Chinese Academy of Sciences (under grant number 114231KYSB20170003). Access to Pawsey Data Storage Services is governed by a Data Storage and Management Policy (DSMP).
ASVO has received funding from the Australian Commonwealth Government through the National eResearch Collaboration Tools and Resources (NeCTAR) Project, the Australian National Data Service (ANDS), and the National Collaborative Research Infrastructure Strategy. This paper makes use of services or code that have been provided by AAO Data Central (datacentral.org.au). This research has made use of NASA's Astrophysics Data System Bibliographic Services.

The following software was used in this work: AOFlagger and Cotter (Offringa et al., 2012); WSClean (Offringa et al., 2014; Offringa & Smirnov, 2017); Aegean; miriad (Sault et al., 1995); TopCat (Taylor, 2005); NumPy (Dubois et al., 1996; Harris et al., 2020); AstroPy (Astropy Collaboration et al., 2013); SciPy (Oliphant, 2007); Matplotlib (Hunter, 2007). This work was compiled in the very useful online LaTeX editor Overleaf.

Table A1. GLEAM-X observing summary. The HA and Dec are fixed to the locations shown and the sky drifts past for the observing time shown. Observations typically start just after sunset and stop just before sunrise. The four nights published in this work are shown in bold font. Nights identified as having high ionospheric activity are marked with a "*".

Date          HA (h)   Dec (deg)   Observing time (h)
[...]         −1       −12         7.9
2020-10-03    −1       +1          9.8
2020-10-04*   −1       +20         9.8
2020-10-05    0        −71         9.8
2020-10-06    0        −55         9.8
2020-10-07    0        −40         9.8
2020-10-08    0        −26         9.8
2020-10-09    0        −12         9.6
2020-10-10*   0        +1          8.8
2020-10-11    0        +20         9.8
2020-10-12    0        −71         8.6
2020-10-13    0        −55         8.4
2020-10-14    0        −40         5.6
2020-10-15    0        −26         9.1
2020-10-16    0        −12         9.0
2020-10-17*   0        +1          9.8
2020-10-18    0        +20         9.7
2020-10-19    +1       −71         9.5
2020-10-20    +1       −55         9.5
2020-10-21    +1       −40         9.5
2020-10-22    +1       −26         8.2
2020-10-23    +1       −12         9.5
2020-10-24*   +1       +1          9.5
2020-10-25    +1       +20         8.0
Total:                             1,056.5

Table A2. Column numbers, names, and units for the catalogue. Source names follow International Astronomical Union naming conventions for co-ordinate-based naming.
Background and RMS measurements were performed by BANE (Section 3.7); PSF measurements were performed using in-house software as described in Section 3.6; the fitted spectral index parameters were derived as described in Section 4.4; all other measurements were made using Aegean. Aegean incorporates a constrained fitting algorithm. Shape parameters with an error of −1 indicate that the reported value is equal to either the upper or lower fitting constraint. The columns with the subscript "wide" are derived from the 200 MHz wide-band image. For all other columns, the subscript indicates the central frequency of the measurement, in MHz. These sub-band measurements are made using the priorised fitting mode of Aegean, where the position and shape of the source are determined from the wide-band image, and only the flux density is fitted (see Section 4.1). Note therefore that some columns in the priorised fit do not have error bars, because they are linearly propagated from the wide-band image values (e.g. major axis a).

Number   Name   Unit   Description