key: cord-0284931-f5104ts8 authors: Shivaei, Irene; Popping, Gergo; Rieke, George; Reddy, Naveen; Pope, Alexandra; Kennicutt, Robert; Mobasher, Bahram; Coil, Alison; Fudamoto, Yoshinobu; Kriek, Mariska; Lyu, Jianwei; Oesch, Pascal; Sanders, Ryan; Shapley, Alice; Siana, Brian title: IR SED and Dust Masses of Sub-solar Metallicity Galaxies at z~2.3 date: 2022-01-12 journal: nan DOI: 10.3847/1538-4357/ac54a9 sha: afeb32d4a8c76db9896787fc5ba9cc15d6c9d531 doc_id: 284931 cord_uid: f5104ts8 We present results from ALMA 1.2mm continuum observations of a sample of 27 star-forming galaxies at z=2.1-2.5 from the MOSFIRE Deep Evolution Field (MOSDEF) survey. These galaxies have gas-phase metallicity and star-formation rate measurements from Hb, [OIII], Ha, and [NII]. Using stacks of Spitzer, Herschel, and ALMA photometry (rest-frame ~ 8-400$mu$m), we examine the IR SED of high-redshift subsolar metallicity (~0.5 $Z_{odot}$) LIRGs. We find that the data agree well with an average SED template of higher luminosity local low-metallicity dwarf galaxies (reduced $chi^2$ of 1.8). When compared with the commonly used templates for solar-metallicity local galaxies or high-redshift LIRGs and ULIRGs, even in the most favorable case (with reduced $chi^2$ of 2.8), the templates are rejected at>98% confidence level. The broader and hotter IR SED of both the local dwarfs and high-redshift subsolar metallicity galaxies may result from different grain properties, a clumpy dust geometry, or a harder/more intense ionizing radiation field that heats the dust to higher temperatures. The obscured SFR indicated by the FIR emission of the subsolar metallicity galaxies is only ~ 60% of the total SFR, which is considerably lower than that of the local LIRGs with ~ 96-97% obscured fractions. Due to the evolving IR SED shape, the local LIRG templates fit to mid-IR data can overestimate the Rayleigh-Jeans tail measurements at z~2 by a factor of 2-20, and these templates underestimate IR luminosities if fit to the observed ALMA fluxes by>0.4dex. At a given stellar mass or metallicity, dust masses at z~2.3 are an order of magnitude higher than those at z~0. Given the predicted molecular gas mass fractions, the observed z~2.3 dust-to-stellar mass ratios suggest lower dust-to-molecular gas masses than in local galaxies at the same metallicity. The infrared (IR) emission of dust in galaxies accounts for a significant fraction of their bolometric luminosity and encodes critical clues to how it is produced. By mass, dust only represents ∼ 1% of the ISM in typical galaxies. However, it reshapes galaxy spectral energy distributions (SEDs) by attenuating and absorbing UVoptical photons and reradiating that energy in the IR. The resulting IR emission accounts for approximately half of the cosmic extragalactic background (Dole et al. 2006; Finke et al. 2010) , and the bulk of the cosmic star formation at z ∼ 0 − 3 is detected in the IR (Madau & Dickinson 2014; Planck Collaboration et al. 2014; Casey et al. 2018) . IR SEDs consist of a roughly Planckian and featureless far-IR (FIR) component plus emission features of aromatic molecules in the ∼ 6 − 20 µm range. While the mid-IR spectra (λ = 5 − 30 µm) are dominated by the emission from small grains that are stochastically heated by single photons, the longer wavelength FIR and submm emission comes from larger grains that are in thermal equilibrium. The shape of the FIR/submm SED depends on the dust composition (which determines the submm spectral slope), the distribution of radiation field intensities on the dust, and the dust grain size distribution, which together determine the peak and width of the IR SED. In a comprehensive study of local galaxies, Rémy-Ruyer et al. (2015) showed that while there are many commonalities, distinct differences exist in the IR SEDs of low-metallicity dwarfs and metal-rich starforming local galaxies. They found that, on average, the low-metallicity galaxies have broader IR SEDs that peak at shorter wavelengths compared to those of the metalrich galaxies. Rémy-Ruyer et al. (2015) attributed the differences in the IR SEDs to a wider range of interstellar radiation field intensities (σU ), with a higher average radiation field intensity ( U 15 ) in low-metallicity dwarfs due to their high specific star formation rates (sSFRs; sSFR= SFR/M * ). In a sample of local galaxies with oxygen abundances of 12 + log(O/H)∼ 8.2 − 8.8, Cortese et al. (2014) also found a strong correlation between the wavelength-dependent emissivity index of the dust (β) and metallicity, with only a weaker anti-correlation with sSFR. These results illustrate that the effects of metallicity and of the radiation field on the integrated IR SEDs of galaxies cannot be easily separated, as the two parameters themselves are interconnected. Low-metallicity systems are often dominated by young stellar populations that have relatively hard and intense radiation fields. They also have lower dust-to-gas mass ratios (e.g., Draine et al. 2007b; Rémy-Ruyer et al. 2014 ) that make their ISM more transparent to the stellar radiation, allowing massive star formation to impact the ISM over a larger volume. These effects make the ISM metallicity a good tracer of dust evolution processes in galaxies, as it traces both the elemental abundances in the ISM and the interstellar radiation field intensity that the dust grains are exposed to. However, there are very few studies of the effect of metallicity on dust emission properties outside the local Universe, due both to the sensitivity limitations of the available IR-submm data and to the lack of robust metallicity measurements. Recent studies have shown that at a given stellar continuum reddening (UV slope), the IR to UV luminosity ratio (IRX) of z ∼ 2 starforming galaxies correlates with metallicity in the range of 12 + log(O/H)∼ 8.3 − 8.6 (Shivaei et al. 2020a) or with stellar mass (Reddy et al. 2018a; Fudamoto et al. 2019) . This may indicate a change in grain properties or dust-star geometry with metallicity. Moreover, in Shivaei et al. (2017) we showed that the mid-IR aromatic (PAH) emission relative to the total IR luminosity of z ∼ 2 galaxies decreases at 12 + log(O/H) 8.3 − 8.4, similar to the behavior seen in the local Universe but at 12 + log(O/H)≤ 8.2 (Engelbracht et al. 2005; Draine et al. 2007b; Hunt et al. 2010; Marble et al. 2010; Li 2020 , among many more). These studies suggest that the emission properties of dust at z ∼ 2 vary significantly between solar and subsolar metallicity galaxies. The main goal of this work is to investigate the FIR emission of z ∼ 2.3 subsolar metallicity galaxies and assess whether the broadening of the FIR SED seen locally in low metallicity galaxies occurs at sub-solar metallicities at high redshifts. Such behavior, if confirmed, would provide valuable insight into the dust and gas properties of high redshift galaxies. Additionally, it will have implications for the calibration of the JWST, ALMA, and other IR/submm measurements interpreted as SFR, IR luminosity, and dust mass indicators. To reach this goal 15 The intensity U of the interstellar radiation field (Dale et al. 2001 ) determines both the shape and normalization of the IR SED. deep observations across the wavelengths from mid-IR to submm are required to detect the IR emission of the lower luminosity, and hence lower metallicity, galaxies at high redshifts. In this work, we present the first results of a targeted ALMA band-6 (1.2 mm) continuum survey, tracing restframe submm emission of a sample of 27 star-forming galaxies at z = 2.1 − 2.5 drawn from the MOSFIRE Deep Evolution Field (MOSDEF) survey . The galaxies have robust metallicity and starformation rate (SFR) measurements from optical emission lines (Hα, Hβ, [Oiii] , [Nii] ), and span a wide range in oxygen abundance from 12 + log(O/H)= 8.1 to 8.8. The availability of this detailed prior information enables this targeted survey of faint subsolar metallicity galaxies with ALMA. Compared to the local galaxies with similar metallicities, the z ∼ 2.3 galaxies have higher SFRs and SFR surface densities, which make this analysis unique. ALMA band-6 traces the Rayleigh Jeans (RJ) tail of the dust emission at rest-frame ∼ 360 µm for our sample. Combining the ALMA data with the shorter wavelength Spitzer and Herschel data at rest-frame 7 to 100 µm, we can test whether the relatively broad FIR SEDs seen in local low metallicity galaxies are also typical at z > 1. By constraining the change in IR/submm SEDs with metallicity, we provide insight into the integrated dust properties of high-redshift galaxies (such as IR luminosity and obscured SFR), determine the masses of dust, study the dust mass fraction (dust-to-stellar mass) versus metal fraction (aka metallicity) at z ∼ 2.3, and explore its redshift evolution by incorporating various z ∼ 0 surveys from the literature. Our results extend over 0.7 dex in metallicity and down to stellar masses of ∼ 10 9.7 M , exploring a new parameter space for galaxy evolution studies at these redshifts. The outline of the paper is as follows. In Section 2, we present the sample and data, and explain the data reduction. Section 3 presents the analysis and construction of IR SEDs. Section 4 discusses outcome of the IR SED analysis, how it changes with metallicity and starformation properties, and its physical interpretation. In Section 5, we discuss the implications for measuring IR luminosity and SFR, as well as predicting submm fluxes based on shorter wavelength data. The evolution of dust mass fraction versus redshift and metallicity, and the comparison of dust-to-gas mass ratios with local galaxies are covered in Section 5.4. Our results are summarized in Section 6. We discuss the different models and IR templates that are used to fit the data in Appendix A. The systematic uncertainties in calculating dust masses from submm fluxes are discussed in Appendix B. A cosmology with H 0 = 70 km s −1 Mpc −1 , Ω Λ = 0.7, and Ω m = 0.3, and a Chabrier (2003) IMF are adopted. Our sample is selected from the MOSDEF near-IR spectroscopic survey . MOSDEF is a survey of ∼ 1500 galaxies at z = 1.3 − 3.8 with near-IR spectra from the MOSFIRE spectrometer on Keck, primarily in the COSMOS, GOODS-N and AEGIS fields (a small fraction of the MOSDEF galaxies is located in the GOODS-S and UDS fields). Using the catalogs of the 3D-HST survey (Skelton et al. 2014) , the MOSDEF parent sample was selected based on the available photometric and spectroscopic redshifts in three redshift windows of z = 1.4 − 1.7, z = 2.1 − 2.6, and z = 3.0 − 3.8 down to H-band (AB) magnitudes of 24.0, 24.5, and 25.0, respectively. In these redshift ranges, most of the prominent optical emission lines are observed in the Y JHK bands of MOSFIRE. The ALMA survey presented here (program 2019.1.01142.S; PI: I. Shivaei) targeted MOSDEF galaxies in the COSMOS field for accessibility with ALMA. There are 629 MOSDEF-targeted galaxies in this field, of which 209 are within the correct redshift range (2.0 < z < 2.5), such that the ALMA 1.2 mm and Spitzer MIPS 24 µm spectral bands trace the cold dust and 7.7 µm PAH emission, respectively. We further required (1) > 3σ sigma detections in detection in the Hα, Hβ, [Oiii] , and [Nii] lines to ensure robust estimates of SFR and metallicity Sanders et al. 2015; Shapley et al. 2015) , which narrowed down the sample to 51; (2) unconfused MIPS 24 µm photometry (Shivaei et al. 2017) ; (3) no evidence for AGN, based on X-ray emission, IRAC colors, and [Nii]/Hα line ratios (Coil et al. 2015; Azadi et al. 2017 Azadi et al. , 2018 Leung et al. 2019) ; and (4) rest-frame U − V and V − J colors consistent those of star forming galaxies (Zick et al. 2018) . Of the 40 galaxies remaining, we select the final sample of 27 to represent a wide range in mass and metallicity. The mass, metallicity, and sSFR distributions of galaxies in our sample are shown in Figure 1 . The MOSDEF parent sample is largely representative of typical main-sequence star-forming galaxies (e.g., Shivaei et al. 2015b ). However, as the parent sample is selected based on rest-frame optical photometry, there is a bias towards galaxies with redder UV and optical colors at a given UV magnitude compared to UV-selected samples (Reddy et al. 2018b ). On the other hand, the subsample that is selected for ALMA followup in this work is selected based on strong detection in multiple optical emission lines, and may exclude highly dust-obscured galaxies with simple dust-star geometries 16 . The wellmeasured emission lines allow use of the Balmer decrement (Hα to Hβ flux ratio) to derive corrections for reddening of the nebular lines to recover the total (intrinsic) nebular emission. The success of this approach for a Hα-Hβ detected sample is illustrated in Shivaei et al. (2016) who compare Hα SFR estimates corrected for reddening via the Balmer decrement with IR-based SFR estimates. The ALMA continuum observations in band 6 (with a representative frequency of 242 GHz) were executed in three different blocks with the 12m array in C43-2 and C43-3 configurations. Different depths were requested for the three blocks that included subsamples with different masses and metallicities, based on the expectation of what would have been detectable. The high mass/high metallicity bin, M * > 10 10 M and 12 + log(O/H) PP04 . The Pettini & Pagel (2004) calibration estimates ∼ 0.15 dex lower metallicities from the same O3N2 line ratios. The Pettini & Pagel (2004) Data are calibrated and imaged with the Common As- 17 We initially adopted three mass-metallicity bins for sensitivity calculations, however, later in Sections 3 and 4 we simply divide the full sample into two subsolar-and solar-metallicity bins to increase the SNR in the Spitzer and Herschel stacks. We go back to three bins in Section 5.4 again, as the analysis is based on the ALMA data alone. tronomy Software Applications (CASA) package (Mc-Mullin et al. 2007 ). We create a cleaned image using the tclean package with natural weighting and the multifrequency synthesis (mfs) mode, and we apply tapering, which by suppressing the weights of the outer visibilities increases the beam size (lowers the resolution). The reason for using tapering is the effect of the non-Gaussian nature of the dirty beam on flux estimation (see Czekala et al. 2021 and the appendix in Jorsater & van Moorsel 1995 for a complete discussion). In brief, in the tclean process, the clean model is convolved with the clean beam and has units of Jy/clean beam, while the dirty image and the residual map that are created after running the tclean task are in units of Jy/dirty beam. If the dirty beam is a Gaussian, the two beams would have roughly the same area. However, if the dirty beam has shelves, as in the case of our data, the integrated dirty beam response would be larger than the integrated clean beam response. Therefore, in the latter case, once the residual map is added back to the convolved clean model, the fluxes will be overestimated. This effect is more pronounced for low signal-to-noise ratio (SNR) data, like ours, as the fraction of the flux in the residual map is higher. The solution can be either to convolve the clean model with a larger beam that matches the area of the dirty beam, or to scale the residuals following the prescription of Jorsater & van Moorsel (1995) . The former sacrifices the resolution for correct flux and noise estimates. The latter retains the resolution, however it scales down the noise outside the regions containing emission, resulting in an incorrect noise determination (Walter et al. 2008) . Given that for this analysis we are not concerned about the resolution, we adopt the first solution by using tapered images, yielding a synthesized beam with FWHM of 1.4 × 1.4 . In our sample, the fluxes estimated from the untapered natural-weighted images are higher by ∼ 15% than the fluxes derived from the tapered (unaffected) images. This value is in agreement with the flux overestimate percentage calculated using the prescription of Jorsater & van Moorsel (1995) . Fluxes are extracted through aperture photometry, performed on the primary-beam-corrected images. We convert the image units from Jy/beam to Jy/pix using the number of pixels per beam (calculated from the image header keywords; bmajor, bminor, cdelt1, cdelt2). Then, the total integrated aperture flux is calculated as the sum of the pixel values in a given aperture. An aperture radius of 1.45 is used to include > 99% of the pointsource Gaussian area (1.45 is 2.1 times the half width at half maximum of the Gaussian beam). The aperture fluxes are cross-checked for consistency with those derived from 2D fitting on tapered images through CASA. We do not attempt to derive fluxes from the peak signal, as peak fluxes are highly uncertain for low SNR objects such as the ones in our sample. Since the targets are all at the center of the images, the non-primary beam-corrected maps are used for error measurements (e.g., as in Betti et al. 2019) . The noise for the integrated flux measurements is estimated by taking the standard deviation of the integrated flux measurements in 100 apertures of the same size and offset randomly from the source position. Given the prior knowledge on the location of the galaxies in HST/F160W images 18 , a detection is defined as SNR > 2 in the integrated aperture flux measurements. According to this criterion, 10 out of 27 objects in the sample are detected. Five detections have M * > 10 10 M and 12 + log(O/H) B18 > 8.5 19 . Three have M * > 10 10 M and 12 + log(O/H) B18 < 8.5, and the other two have M * < 10 10 M and 12 + log(O/H) B18 < 8.5. The fluxes and their errors are listed in Table 1 Despite the detections for 40% of the sample, for this work we rely on image stacking for two main reasons: 1) the galaxies are not individually detected in Herschel (and in some cases in Spitzer) images, and hence stacking is required to extract the Herschel and Spitzer average fluxes, and 2) the IR SEDs of the detected objects may be biased relative to the underlying population, while stacking better represents the average behavior of the population. We start with the primary beam-corrected tapered images in Jy/pixel units with the sources at the center of subimages according to their optical coordinates. The optical coordinates are used for stacking, given that the targets are relatively faint and are unresolved in ALMA. The stacks are constructed by taking the average flux values in each pixel. We measure stack fluxes using aperture photometry with aperture radius of 1.45 , as described in the previous section. The background and noise are measured in non-primary beam-corrected image stacks, in a similar manner as for individual galaxies. The emission of individual objects is corrected for different redshifts prior to stacking (K-correction). Following Scoville et al. (2016, equation 6) , we use the spectral slope of the Planck function with a temperature of 35 K, and calculate the correction factor at the redshift of each galaxy relative to the average redshift of the sample. Assuming a different temperature does not change the results significantly, as the correction factors are 5% owing to the narrow redshift range of this sample. We use Spitzer MIPS 24 µm (Sanders et al. 2007 ) and Herschel PACS 100 and 160 µm and SPIRE 250 and 350 µm images (Lutz et al. 2011; Oliver et al. 2012) in this work to construct IR stacks, as the majority of our galaxies are not individually detected in any of these bands. Stacks are constructed following standard methods (e.g., Zheng et al. 2006; Reddy et al. 2010; Shivaei et al. 2017) , as explained below. The signal to noise in these datasets is background limited and the dominant source of noise is source confusion. Our stacking approach is designed to extract accurate values using proven approaches to minimize this noise source, as described below. All images are converted to Jy/pixel. For each target, a subimage with the target in the center is constructed. The avoidance of regions near bright sources can significantly reduce confusion noise (Leiton et al. 2015) . Therefore, the images at 24 and 100 µm are inspected to ensure there are no bright sources near the target galaxy that could add noise even after nominal removal (no cases were found). The subimage is then cleaned of neighboring sources as follows, to provide an optimum smooth background for measuring the stacked flux. Prior lists for MIPS 24 µm sources in the field are generated based on galaxies that are detected at SNR > 3 in the IRAC channels 1 and 2. For Herschel PACS images, we use a list of priors with SNR > 3 in the MIPS 24 µm data. Magnelli et al. (2013) show that at this depth (∼ 20 − 25 µJy), virtually full identification of the PACS sources will be achieved. For the SPIRE data, we tested priors at SNR of > 3 and > 5, and settled on the latter value because the density of 3-σ priors made the removal ambiguous, given the large beams in these bands and the probability of more than a single prior within a beam area. The neighboring sources in each subimage are removed by simultaneously fitting scaled PSFs to all the prior sources and to the target, and removing all but the target (which is usually not detected). These cleaned subimages are aligned and then used to construct 3-σ trimmed average stacks 20 . Stack fluxes are derived by summing the pixel values of the stack image within an aperture. Appropriate aperture corrections are calculated for the MIPS photometry based on the 24 µm PSF, and the aperture corrections for the SPIRE and PACS photometry are adopted from the Herschel Legacy Archive 21 and Balog et al. (2014) , respectively. To determine the uncertainty in the stacked fluxes, we measure the flux in 1000 apertures that are randomly positioned in the cleaned subimages away from the center of the image (where the source is) by more than 1 FWHM of the image PSF. The apertures have the same radii as the source aperture radius. The standard deviation of these 1000 fluxes is taken as the stacked flux uncertainty. As this uncertainty is evaluated on the region around the source, it should be a measure of the net uncertainty including confusion noise. We now compare these results with standard estimates of confusion noise. A stacking approach virtually identical to ours has been simulated at 24 µm by Zheng et al. (2006) . That is, they first removed all individually detected sources in the surrounding area for each target, and then stacked the cleaned images. They measured the rms background fluctuations by placing apertures at random places on the stacked images. The rms fluctuations decreased accurately in proportion to the square root of the number of stacked images. They found that 12 images gave a 5-σ detection at 24 µJy; scaling to our 21 images for the low metallicity sample predicts a 7-σ result, and we found a signal to noise of 8 at this flux level, in excellent agreement. At the longer, Herschel, wavelengths our method uses the galaxy stack positions as priors; this is important because use of priors can reduce confusion effects by factors of 2 − 3 (e.g., Rodighiero et al. 2006; Magnelli et al. 2013; Safarzadeh et al. 2015) . For the PACS data, we take confusion noise measurements from the deep survey described by Berta et al. (2011) , who quote 1-σ values of 270 and 920 µJy respectively at 100 and 160 µm using 24 µm sources as priors. Magnelli et al. (2013) quote somewhat lower confusion noise limits, but we attribute this difference to the significantly longer integration than used by Berta or us and the resulting modest improvement in PSF fitting in crowded fields. Taking scaling as found by Zheng et al. (2006) Columns left to right: 3D-HST v4 catalog ID, RA (deg, J2000.0), Dec (deg, J2000.0), ALMA 1.2 mm flux density, its error, Spitzer 24 µm flux density, and its error. Galaxies below and above the horizontal line are below and above 12 + log(O/H)=8.60 (in the B18 calibration), respectively. The Herschel data for the majority of sources are not individually detected, therefore we refrain from listing the individual Herschel measurements. with the square root of the number of stacked images, the noise we estimate is consistent for both the high and low metallicity samples (although slightly higher than the predicted confusion limit). For the SPIRE data, we start with the confusion noise values from Smith et al. (2017) ; their values for the nebulized beam should be relevant for our situation. We assume that these values scale inversely with the square root of the number of images stacked, as in Zheng et al. (2006) . As these values were determined without priors, there should be some further reduction in our case as discussed above. To determine this gain in our case, we compare the noise predicted from confusion with that from our multi-position measurements. We find that the case of using the 5-σ 24 µm priors gives a consistent comparison if we assume an improvement in depth from the use of priors of a factor of 1.4. An exception is for the high metallicity sample at 350 µm, where the measured noise is about half of the prediction. In any case, the predicted confusion levels are large enough for this band that it has little utility in constraining fits. We therefore have discarded both the high and low metallicity stacked results at 350 µm. To verify the accuracy of the aperture flux measurements on the stacked image of the target galaxies, we introduce fake sources following a similar methodology to that described in Reddy et al. (2012a) . We simulate 100 PACS sources with fluxes randomly drawn from a Gaussian distribution centered at the flux values of the real stacks (the width of the distribution is irrelevant for this test). Then, simulated sources in the same number as those that went into the real stack bins are randomly selected for stacking and aperture flux measurement. Based on this input and recovered fluxes, we find that a 3-σ trimmed average stack is a more accurate estimator compared to a median stack. The recovered stack fluxes are ∼ 10% lower than the average input. This correction factor is applied to our stack fluxes. However, the correction is negligible compared to the measurement error. A final source of uncertainties is the absolute calibration of the various datasets, but these errors are negligible and do not affect our results. For example, the 24 µm absolute calibration of MIPS is accurate to ∼ 2% or better (Rieke et al. 2008 ). The Herschel PACS and SPIRE absolute calibrations are accurate to 4−5% (Bendo et al. 2013; Balog et al. 2014; Müller et al. 2014; Bertincourt et al. 2016) . The nominal uncertainty in the absolute calibration of the ALMA observations is 5% (Remjian et al. 2019 ), but it is likely to be twice as accurate (Farren et al. 2021 ) particularly given the good conditions of our observations. The MOSFIRE spectral reduction and line flux extraction are fully described in Kriek et al. (2015) , Reddy et al. (2015) , and Freeman et al. (2019) . In summary, emission line fluxes are measured from the MOSFIRE 1D spectra by fitting Gaussian functions on top of a linear continuum. The uncertainties are derived by perturbing the spectrum of each object according to its error spectrum and measuring the standard deviation of the new realizations. Slit-loss (also called path-loss) corrections are applied by normalizing the spectrum of a "slit star" placed on each MOSFIRE observing mask to match the 3D-HST total photometric flux. Additionally, the HST F160W images of the resolved targets were used to estimate and correct for the additional flux lost outside of the slit aperture, relative to the slit star. For details refer to Kriek et al. (2015) . SFRs are estimated from Hα luminosities, corrected for dust attenuation using Balmer decrements and the Cardelli et al. (1989) MW extinction curve. The success of this approach to recover total SFRs for an Hα-Hβ detected sample is demonstrated in Shivaei et al. (2016) , where Hα SFR estimates corrected for reddening via the Balmer decrement are compared with IR-based SFR estimates. The assumption of a MW extinction curve to correct the observed nebular emission is supported by previous studies that showed the nebular attenuation curve is similar to the MW extinction curve Rezaee et al. 2021) . Following recent studies that have discussed the metallicity dependence of the Hα luminosity to SFR conversion (e.g., Reddy et al. 2018b; Theios et al. 2019) , we adopt two different conversion factors for galaxies with oxygen abundances at roughly the solar value (12 + log(O/H)> 8.6 22 ) and those with lower 12 + log(O/H). We assume the stellar and ISM gas metallicity are linearly correlated with each other. For galaxies with 12 + log(O/H)> 8.6 we adopt the conversion factor of Hao et al. (2011) converted to a (Chabrier 2003 ) IMF 23 : log(SFR/M yr −1 ) = log(L(Hα)/erg s −1 ) − 41.26, (1) and for lower metallicity galaxies, we adopt the conversions derived from the Bruzual & Charlot (2003) stellar population models with constant star formation over 100 Myr and Z = 0.004 from Reddy et al. (2018b) and Theios et al. (2019) , once adjusted to the same IMF that is assumed in Equation (1) For reference, the Kennicutt (1998) and Kennicutt & Evans (2012) The metallicity of the gas is defined as 12 + log(O/H). It is common to estimate O/H using ratios of strong optical emission lines and adopting calibrations that have been constructed based on the observations of electron-temperature-sensitive lines (the so called "direct" method) in local galaxies and Hii regions (such as those of Pettini & Pagel 2004) . At high redshifts, due to the evolving conditions of the ionized gas, the locally-calibrated relations may yield biases in the absolute values of the oxygen abundances (e.g., Kewley et al. 2013; Steidel et al. 2014; Shapley et al. 2015; Sanders et al. 2016a; Strom et al. 2018; Kashino et al. 2019) . The degree of this bias will not be well known until large and representative samples of high-redshift galaxies with temperature-sensitive auroral lines are constructed to calibrate the strong emission lines for oxygen abundances. Bian et al. (2018) used a sample of local analogs of z ∼ 2 galaxies to derive empirical direct-method calibrations between 12 + log(O/H) and strong optical line ratios. At the low-metallicity end, their calibrations for the oxygen and hydrogen line ratios agree with the median of 18 z ∼ 1.5 − 3.5 galaxies with direct-method metallicities from Sanders et al. (2020) . Unfortunately, the [Nii] line was not available for the z ∼ 1.5−3.5 directmethod sample in Sanders et al. (2020) Maiolino et al. (2008) and Curti et al. (2017) and the z ∼ 0 relations of Sanders et al. (2021) at 12 + log(O/H)> 8.0. In this paper, we use O3N2 line ratios mainly for the practical reason that the four lines are available with high SNR for all of our targets. O3N2 strongly correlates with O/H inferred from strong line ratios involving only oxygen (such as R 23 , which is the ratio of ([Oiii]λλ4959, 5007+[Oii]λλ3726, 3729) to Hβ), and with stellar mass at z ∼ 2 (e.g., Sanders et al. 2018; Strom et al. 2018) . O/H inferred from O3N2 is less biased by N/O variations than when using N2, because the O3N2 spans a wider dynamic range in the line ratios. Additionally, all current calibrations for O3N2 show a nearly linear anti-correlation between the strong line ratios and O/H without any turnovers or plateaus at 12 + log(O/H)> 8.0 (Pettini & Pagel 2004; Maiolino et al. 2008; Curti et al. 2017; Bian et al. 2018; Sanders et al. 2021) . Following the methodology of Sanders et al. (2021) , in the small subset (6) Bian et al. (2018) calibrations. These metallicities agree well with the O3N2-derived metallicities. For reference, the O3N2 metallicities in our sample are also tightly correlated with the N2 metallicities, but are on average ∼ 0.05 − 0.10 dex lower. Given the aforementioned uncertainties in estimating O/H at high redshifts, one should take caution in comparing metallicities derived using different line ratios and calibrations from different studies. In Figure 1 , we show the metallicities derived from both the Bian et al. (2018, hereafter B18) and Pettini & Pagel (2004, hereafter PP04) O3N2 calibrations. g Total IR luminosity of the subsolar and solar metallicity bins are based on the best-fit local low-metallicity template ( Figure 4 ) and best-fit R09 LIRG template (Figure 3 ), respectively. h Obscured SFR estimated from total IR luminosity and calibrations of Kennicutt & Evans (2012) . These calibrations provide a linear relationship between O3N2 and oxygen abundance, and therefore the ordering of galaxies in oxygen abundance is preserved regardless of which calibration is used. However, the choice of calibration can affect comparisons from one sample to another. As a reference point, the metallicity of 12 + log(O/H)= 8.35 from the PP04 O3N2 calibration corresponds to 12 + log(O/H)= 8.51 in the B18 O3N2 calibration. For the ease of comparing with other studies, we refer to the metallicities (12 + log(O/H)) from both calibrations in the rest of this paper. Stellar masses are derived from SED fitting to the photometry of the 3D-HST survey (Skelton et al. 2014 ). The photometry is corrected for the nebular emission lines from the MOSFIRE spectra (details in Sanders et al. 2021) . We use the FAST SED fitting code (Kriek et al. 2009) with the stellar population model library of Conroy et al. (2009) for a solar stellar metallicity, a Chabrier IMF, delayed exponentially declining star formation history, and the Calzetti et al. (2000) attenuation curve. The stellar masses are insensitive to the choice of the attenuation curve, as masses are dominated by the older stellar populations that emit in longer near-IR wavelengths where different dust attenuation curves are very similar and the total amount of attenuation (A λ ) is rel-atively small 25 . In this section, we discuss our construction of the metallicity subsamples (Section 3.1), and the adopted IR templates and SED fitting procedure (Sections 3.2). By design, this study is based on lower-luminosity, and hence fainter, galaxies than have been reached by most FIR/submm-selected samples at similar redshifts. To construct subsamples that contain enough galaxies to reliably yield a detection in the stacked images of Spitzer and Herschel data, we consider two metallicity bins (instead of the three mass-metallicity bins that were initially used in the design of the ALMA survey). The bins are selected to represent metallicities of ∼ solar and subsolar with average metallicities of 12 + log(O/H)= 8.71 and 8.41 on the B18 scale or 8.51 and 8.27 on the PP04 scale, respectively. In these two bins, our stacked images yield detections at 24 µm and 1.2 mm at high SNRs and > 2σ detections in one or more additional IR bands. We 25 The choice of attenuation curve in SED fitting can significantly affect the inferred SFRs, as the main difference between various attenuation curves is the curve slope in the UV, where the emission from recent star formation peaks. Studies have shown that low mass/low metallicity galaxies at z ∼ 2 have a steep SMC like curve while massive/metal rich galaxies have a shallower Calzetti-type curve (Reddy et al. 2018a; Shivaei et al. 2020a,b) . Therefore, in this work we do not use the UV/SED-inferred SFRs. The SFRs are derived from Hα luminosity (Section 2.4.1); hence, the choice of the stellar attenuation curve is irrelevant. provide details of the metallicity subsamples and SED fitting in this section. We first perform a jackknife resampling test to evaluate potential biases caused by outliers. In each metallicity subsample, we estimate stacked 24-1200 µm fluxes by systematically leaving out one object at a time and compare the stacks with those of the full sample (stacking technique described in Sections 2.2.2 and 2.3). We find that the resampled stacks in the subsolar-metallicity bin are all consistent within 1σ (σ being the uncertainty of the stack flux) with the stacks of the full subsolarmetallicity bin. The solar-metallicity bin has a smaller sample size, but except for one object, there is no systematic bias at > 1σ level for more than one photometric band in the resampled stacks. The exception is Object 8280 (Table 1) , whose IR emission behavior is significantly different from the rest of the solar-metallicity sample. This galaxy shows an elevated 24-160 µm emission compared to its 250-1200 µm, indicating the presence of warm dust, which can be due to a recent starburst or buried AGN. Object 8280 has a metallicity of 12 + log(O/H)= 8.66 +0.05 −0.08 and its estimated age from SED fitting is ∼ 50 − 100 Myr depending on the starformation history assumptions. Additionally, its SED inferred SFR (determined through the rest-frame UV continuum emission) is more than a factor of 2 larger than its Hα-estimated SFR. The young age and the elevated UV to Hα SFR both strengthen the argument that this galaxy has recently undergone a strong starburst. As this peculiar galaxy can bias the results of the stacks, we remove it from the rest of the stacking analysis. The properties of the final subsolar and solar metallicity subsamples are listed in Table 2 . Rieke et al. (2009) with IR luminosities of log(L(IR)/L ) = 11.0, 11.25, and 11.5, and the average local lowmetallicity dwarf template (Section 3.2) are shown. The reduced χ 2 values (with four degrees of freedom) are also indicated. The observations follow the behavior of the average local low-metallicity template most closely, indicating a broader and warmer IR SED than predicted by the local LIRG templates. We consider IR templates from the literature that are commonly used. We fit the models by weighted least squares, where the weights are the inverse of the variances of the measurements. As two examples, the Rieke et al. (2009, R09) and the Schreiber et al. (2018, S18) template fits to the solar-metallicity sample are shown in Figure 3 26 . The only free parameter for each template is the normalization. Both of the R09 and S18 templates provide reasonably good fits to the measured IR SEDs for 2 < z < 4 galaxies that are sufficiently luminous to have good detections in multiple FIR/submm bands (e.g., Schreiber et al. 2018; De Rossi et al. 2018) . Similarly good fits are also provided by the Magdis et al. (2012) z ∼ 2 template. In agreement with previous studies on massive (and high metallicity) galaxies at z ∼ 2 (e.g., Elbaz et al. 2011; Reddy et al. 2012b; De Rossi et al. 2018; Shivaei et al. 2018) , the photometry of the solar-metallicity galaxies agrees well with the local LIRG templates. Owing to the small sample size of the solar metallicity bin (N = 5), we do not aim to draw conclusions based on the insignificant χ 2 value differences among different fits. That is, we can take the R09 sets of templates as proxies for other fits to the IR SEDs of solarmetallicity galaxies at z ∼ 2. We now will use the stack measurements from Section 2 to test the FIR behavior of sub-solar metallicity galaxies at z ∼ 2.3, and whether it 26 The R09 templates are adopted in the luminosity range recommended by Rujopakarn et al. (2013) , which selects the set of R09 models that most closely matches the shape of the IR SED of galaxies at z > 1. These models should be applicable up to a total IR luminosity of ∼ 4 × 10 12 L (Shipley et al. 2016 ), but the lower luminosity limit of applicability is not determined. We use the S18 IR template with temperature of 35 K and PAH fraction of 3%, as suggested by Schreiber et al. (2018) for z ∼ 2 galaxies with stellar masses of ∼ 10 10.0−11.5 M . exhibits the relatively broad FIR SED seen for local lowmetallicity galaxies (Rémy-Ruyer et al. 2015; Lyu et al. 2016) . As a starting point in the interpretation of our low metallicity measurements, we construct a publicly available 27 IR SED template to represent the average behavior of local low-metallicity galaxies in a quantitative way, as follows. We take the photometry for the template from the comprehensive and homogeneous results in Rémy-Ruyer et al. (2015) . The FIR characteristics of low-metallicity galaxies depend on luminosity, such that above log (L(IR)/L ) ∼ 9.5, the average SED is significantly warmer than below this value (Rémy-Ruyer et al. 2015) . For this reason, we use only the galaxies at and above this threshold for the average template. This results in 11 galaxies. Before averaging their photometry for the FIR SED, we exclude three for the following reasons. Examination of the PACS 70 µm image for the local galaxy UM 311 shows that, within the 115 aperture used to extract photometry, the signal would have been dominated by the disk of a nearby galaxy, NGC 450 (it has even been suggested that UM 311 is an Hii region in this galaxy), which artificially raises its stated luminosity. Moreover, the local galaxy HS0052+2536 is much fainter than the rest of the sample, and its SPIRE measurements have too low SNRs to be useful. Additionally, the IRS spectrum of the local galaxy Mrk 930 is too noisy and the slope of the spectrum conflicts with the photometry. We take mid-IR spectra for all 9 (excluding UM 311 and Mrk 930) from Lebouteiller et al. (2011) through the NASA/IPAC Infrared Science Archive (IRSA). We average the measurements (photometry and spectra) in logarithmic space and fit the FIR photometry with polynomial series. We impose a small power law slope (linear in log space) to make the synthetic photometry on the averaged spectrum match the photometry in IRAC Band 4, while achieving a smooth join to the fit to the FIR photometry at 20 µm. Figure 4 shows the fits to the sub-solar metallicity stacks (N = 21) to the average local low-metallicity template, as well as three of the R09 LIRG templates in the range of log(L(IR)/L ) = 11.0 − 11.5, as recommended for high redshift galaxies (De Rossi et al. 2018) . None of the LIRG templates provide a good fit to the stacked photometry from 24 to 1200 µm, as they either underestimate the 100 µm emission or overestimate the 150-250 µm emission significantly, showing that the templates are not broad enough to represent the sub-solar metallicity stacks. We also fit other commonly used templates to the stacks to investigate their goodness of the fit parameter (reduced χ 2 ): the template of S18 with T = 35 K and 3% PAH fraction (χ 2 red = 2.8), the Chary & Elbaz (2001) (2002) templates (χ 2 red = 11.2). None of these templates provide reasonable fits to the data and they are rejected at > 98% confidence levels. In comparison, the average local low-metallicity tem-27 http://www.ireneshivaei.com/shivaei22.html plate better represents the behavior of the z ∼ 2.3 subsolar metallicity stacks with a broader and warmer FIR emission (solid curve in Figure 4 ). It also shows a lower reduced χ 2 of 1.8, compared to the other templates mentioned above. The metallicities of the 9 dwarf galaxies that are used to build the average local low-metallicity template are in the range of 12+log(O/H)= 8.1−8.4 with an average of 8.3 and standard deviation of 0.09 dex. On average, the z ∼ 2.3 sample has a 0.1 dex higher metallicity (on the B18 calibration scale; Table 2 ) with half of the sample having metallicities larger than 8.4, which is the highest metallicity in the dwarf sample. However, the difference between the two strong-line metallicity calibrations adopted here for the high-redshift galaxies, the PP04 and B18, introduces a systematic uncertainty of ∼ 0.1 − 0.2 dex (Section 2.4.2). In any case, the emergence of a warm dust component makes the IR SED of z ∼ 2.3 subsolar metallicity galaxies more similar to that of the local low-metallicity dwarfs than the local LIRGs. We discuss the physical interpretation of this result further in Section 4.2. In the preceding section, we found that the z ∼ 2.3 sample with subsolar metallicity is fitted better by a template based on local low-metallcity galaxies than by templates based on local galaxies of ∼ solar metallicity. We now discuss the underlying behavior of this change in the IR SED shape. To quantify the difference in the warm dust between the subsolar and solar metallicity bins, we use the simple 2T-MBB model of Kirkpatrick et al. (2015, K15) to fit the Herschel and ALMA stacks only. In brief, the model consists of two modified blackbodies with two different temperatures, which we designate as warm and cold dust components (more details in Appendix A.2). The goal here is to investigate how the warm dust temperature and intensity (luminosity) changes between the subsolar and solar metallicity samples. Therefore, due to the lack of sufficient observational constraints on the cold dust component, we fix the cold dust temperature and β and evaluate the change in the warm dust component temperature (T w ) and its peak flux compared to that of the cold component. The cold dust temperature is set to 25 K and the warm component temperature is a free parameter between 30 − 150 K, motivated by the findings of Rémy-Ruyer et al. (2015), who showed that the cold dust temperature is fairly constant at T ∼ 25 K among local galaxies with different metallicities and that the warm dust component can be as high as 150 K in a number of low-metallicity galaxies (see the discussion in Appendix B). We adopt a β of 1.5 for the sub-solar metallicity samples, which is the average submm emissivity index of the local dwarf galaxies (Lyu et al. 2016) . For ease of comparison, we also adopt β = 1.5 for the solar-metallicity model. The assumption of β = 2 does not change any of the main results of the solar-metallicity model. The fits are shown in the top panel of Figure 5 , in which the 24 µm data are not included in the fitting procedure. For the solar-metallicity model, we fit the 2T-MBB function to the model photometry at 100, 160, 250, f w peak /f c peak = 2.0, T w = 68±15 K, T c = 25K, = 1.5 Figure 5 . The two-temperature modified blackbody (2T-MBB) IR fits, as described in Section A.2, to the stacks of 100, 160, 250, and 1200 µm data. First row shows the stacks of the subsolar-metallicity sample in the left panel, and the modeled solar-metallicity photometry based on the best-fit S18 template in Figure 3 in the right panel. Middle and bottom rows show the stacks in two bins of sSFR and SFR surface density (Σ SFR ) respectively. The light grey 24 µm stacked measurement is not included in the fits. The warm and cold components of the fits are also shown separately with the dotted orange and blue curves, respectively. Parameters of the warm and cold dust components are shown in the bottom of the plots: f w peak /f c peak is the warm to cold component peak flux ratio, and Tw and Tc are the warm and cold dust component temperatures, respectively. The average metallicity, redshift, and other relevant properties of the samples are shown in the top-left corners. The most significant change in the width of the IR SED is between the two metallicity bins. and 1200 µm extracted from the best-fit S18 template ( Figure 3 ). The S18 template is also shown in grey in the solar-metallicity panel of Figure 5 . For the sub-solar metallicity model, we fit the 2T-MBB function directly to the subsolar-metallicity stack fluxes at 100, 160, 250, and 1200 µm. The difference between the two metallicity samples is very clear. While the warm and cold dust components have different temperatures in both samples, the two components are easily distinguishable in the subsolar-metallicity bin. This is reflected in the temperature difference between the warm and cold components, as well as their peak flux ratios. The 2T-MBB fit to the synthesized solar-metallicity photometry matches the original IR template very well. The average of the warm (50 K) and cold (25 K) components is the same as the S18 template temperature of 35 K and the warm component profile dominates the IR peak width with a warm-to-cold peak flux ratio of 3. In comparison, the subsolar-metallicity 2T-MBB fit indicates a 108 ± 33 K temperature difference between the cold and warm components, and comparable peak fluxes. 28 Based on these fits we conclude that a) the temperature of the warm dust component increases with decreasing metallicity, and b) while the solar-metallicity IR SED width can be mainly represented by a single MBB, the subsolar-metallicity IR SED is broader as the difference in the temperatures of the two components is larger with similar peak fluxes (i.e., neither of the components dominates in terms of brightness). The solar-metallicity fit has ∆T = 25 ± 3 K, corresponding to ∆λ = 213 +14 −12 µm, while the subsolarmetallicity fit shows ∆T = 108 ± 33 K, corresponding to ∆λ = 347 +26 −16 µm, a factor of 1.6 wider. This behavior is similar to that found for local galaxies of subsolar and solar metallicity, e.g., Rémy-Ruyer et al. (2015) , and is discussed further below. A hotter warm component at low metallicities has also been observed both in the z ∼ 0.3 rest-frame UV analogs of z > 5 galaxies (Faisst et al. 2017) , and in the local dwarf galaxies of the Dwarf Galaxy Survey (DGS, Rémy-Ruyer et al. 2015) . The three low-redshift analogs of z > 5 galaxies in Faisst et al. (2017) with high sSFR and low metallicities similar to our subsolar metallicity sample, show luminosity-weighted temperatures of ∼ 70 − 90 K. Faisst et al. (2017) explained the presence of a hot dust component by possibly an optically thin ISM to UV radiation due to the low metallicities, as well as a strong UV radiation field due to high star formation densities. In Rémy-Ruyer et al. (2015), 11 DGS galaxies show an IR excess at λ rest ∼ 20 − 30 µm. These 11 galaxies have on average 0.25 dex lower 12 + log(O/H) compared to the average oxygen abundance of the rest 28 We caution that the exact value of the warm dust temperature depends on the choice of β and Tc. Therefore, the best-fit Tw values in this section should not be taken literally and are mainly for relative comparisons. However, altering the values of the cold dust temperature and β within reasonable ranges does not affect our main conclusions. For example, a Tc = 30 K for the subsolarmetallicity fit returns a Tw = 150 ± 61 K and f w peak /f c peak = 0.9, and for the solar-metallicity model returns a Tw = 53 ± 2 K and f w peak /f c peak = 1.3. As another example, keeping Tc = 25 K but β = 2, the subsolar (solar) metallicity fit results in Tw = 135 ± 10 K (49 ± 2) and f w peak /f c peak = 0.8 (1.2). Therefore, the overall conclusion that the warm component is hotter in the subsolarmetallicity fit does not change by varying Tc or β. of the sample. Rémy-Ruyer et al. (2015) derived dust temperatures in the range of 100 − 150 K (average of T w = 117 K) for the warm component, and average of T c = 31 K for the cold component for those 11 galaxies, similar to the values found for the z ∼ 2.3 subsolarmetallicity sample in this work. Rémy-Ruyer et al. (2015) attributed the warm component to a higher contribution from the hot Hii regions heating the dust grains in dwarf galaxies, owing to the smaller physical sizes and lower dust attenuation of the local dwarf galaxies compared to local L * galaxies. This effect can also be explained by the higher sSFR of the dwarf galaxies in their sample, producing a wider equilibrium temperature distribution of dust grains, skewed towards higher dust temperatures (hence a hotter and wider IR SED). However, the sSFR distribution of our subsolar and solar metallicity samples are statistically indistinguishable (referring to the sSFR average and dispersion of the two subsamples in Table 2 ), therefore a change in sSFR alone cannot explain the change in the IR SED of our two metallicity samples. Below, we explore possible causes for the elevated warm dust emission in the subsolar metallicity sample in detail. Dust Emission Dust evolution is the combination of grain formation, processing (e.g., grain size modification, structural modification, coagulation), and destruction that can be affected by the incident non-ionizing UV and ionizing radiation, cosmic rays, stellar ejecta, SNe shocks, and ISM elemental enrichment. As a result, the shape of the IR SED can be altered by both the grain properties (size and composition) and the grains' heating sources. The cold dust component resides in the diffuse ISM, is dominated by the emission from large grains, and constitutes most of the dust mass. On the other hand, the warm dust resides closer to star-forming regions or AGN. The hotter and broader IR emission seen in the subsolar-metallicity galaxies in this work may be the result of an overabundance of small dust grains (e.g., Galliano et al. 2018; Ysard et al. 2019) , an intense interstellar radiation field (e.g., Dale et al. 2001; Draine & Li 2007; Galliano et al. 2011; Faisst et al. 2017) , and/or an overabundance of silicate grains (De Rossi et al. 2018 ). These possibilities can be probed by examining the metallicity, sSFR, SFR surface density (Σ SFR ), the age of galaxies, and the AGN activity. As the galaxies in our sample show no evidence of obscured AGNs based on the examination of the IRAC colors (Coil et al. 2015; Azadi et al. 2017 Azadi et al. , 2018 Leung et al. 2019) , we can assume the AGN contribution is negligible in the mid-to far-IR (for a discussion on AGN dust emission see e.g., Kirkpatrick et al. 2012 Kirkpatrick et al. , 2015 Lyu & Rieke 2017 . Below we review the other factors (metallicity, sSFR, Σ SFR , and age) that may be responsible for the broad IR SED shown in Section 3. Metallicity -Metallicity is a tracer of dust processing, owing to grain growth by accretion of gas-phase metals (e.g., Hirashita 2015) . However, the effect of elemental enrichment of the ISM on dust evolution becomes evident over long timescales of ∼ 1 Gyr (Galliano et al. 2018) . Metallicity also traces the dust-to-gas ratio of a galaxy ( At low metallicities, the lower dust-to-gas ratio means the ISM is less dusty and more transparent, enabling the stellar radiation to heat a larger volume and deeper into the molecular cloud. Moreover, if the stellar and nebular metallicities correlate with each other (Cid Fernandes et al. 2005; Gallazzi et al. 2005; Bresolin et al. 2009; Toribio San Cipriano et al. 2017 ), a galaxy with a lower gas-phase metallicity may also have, on average, lower-metallicity stars that emit a harder ionizing spectrum. As a result, the harder and more intense radiation that affects larger volumes of the ISM can contribute to the hotter dust emission in low metallicity galaxies. This effect is in addition to the lower abundances and hence lower dust attenuation that enables heating the dust deeper into molecular clouds, or the higher abundance of smaller (hotter) grains at low metallicities 29 . As demonstrated in Section 3.2, the stack photometry of subsolar metallicity galaxies at z ∼ 2.3 follows very closely the average local low-metallicity template that we construct based on the IR photometry and spectra of higher luminosity dwarf galaxies. This average template has a broader and warmer FIR emission compared to local solar-metallicity LIRGs. Half of the z ∼ 2.3 subsolarmetallicity sample has oxygen abundances higher than the upper limit of the dwarf sample used to build the local low-metallicity template (12 + log(O/H)= 8.4), with an average of 0.1 dex higher oxygen abundance in the z ∼ 2.3 sample compared to the dwarf sample. If this difference is real, as the high-redshift oxygen abundances are subject to the assumed strong-line metallicity calibration (see discussion in Section 2.4.2), it indicates that galaxies at z ∼ 2.3 can have similar ionization field properties as local galaxies of lower metallicity. In fact, the rest-frame UV-optical studies have shown that the O/Fe ratio of z ∼ 2 galaxies is ∼ 0.5 − 0.6 dex enhanced relative to the solar abundance (Steidel et al. 2016; Topping et al. 2020b,a; Cullen et al. 2021; Reddy et al. 2021) . In other words, the O/H abundances of z ∼ 2 galaxies are higher than the O/H of z ∼ 0 galaxies at a given Fe/H. It is thus plausible that the Fe/H that controls the ionizing spectral production is similar between the z ∼ 0 dwarf galaxies and the z ∼ 2.3 galaxies in this study, despite their different O/H values. As a result, a similar ISM ionizing radiation field intensity and hardness causes a similar IR SED shape between the z ∼ 0 dwarfs and the z ∼ 2.3 subsolar-metallicity LIRGs. It is also possible that the warmer IR SED of lower metallicity galaxies originates from a geometry effect, such that the low-metallicity galaxies have a more clumpy ISM where dust is spatially concentrated and heated to higher temperatures. Tentative evidence for a clumpy dust geometry at low metallicities at z ∼ 2 has been discussed in Shivaei et al. (2020b) . In that study, the authors compared the nebular and stellar dust reddening (E(B − V )) and concluded that on average at low metallicities the two reddenings are not the same, which suggests a clumpy two-component dust geometry. We also discuss this possibility and its implications in Section 5.4.1. sSFR -sSFR indicates the ratio of recent SFR to the SFR averaged during the lifetime of the galaxy. In studies of nearby galaxies it has been shown that the temperature of the warm dust correlates with both sSFR and metallicity (Galliano et al. 2005; Boselli et al. 2010; Smith et al. 2012; Rémy-Ruyer et al. 2013 , 2015 , and that at high SFRs it is driven by recently-born young massive stars (Boquien et al. 2011) . Rémy-Ruyer et al. 2015 attributed the warmer and wider SED of local dwarfs compared to the SED of local starbursts to differences in their sSFR. A high sSFR, indicates the galaxy is undergoing an active phase of star formation and is likely to have a clumpier ISM structure as it has a larger number of massive hot stars embedded in their dusty birthclouds (Dale et al. 2007; da Cunha et al. 2008 ). The clumpier ISM allows for a wider range of interstellar radiation field intensities, leading to a wider equilibrium temperature distribution of the dust grains (hence, broader IR SED), and the hotter regions shift the temperature to higher values (da Cunha et al. 2008; Rémy-Ruyer et al. 2015) . sSFR also correlates with SFR surface density (Elbaz et al. 2011) , which is a proxy for an intense radiation field, and hence a hotter dust temperature (see below). Moreover, a relatively higher supernovae rate in actively star forming galaxies allows for an increase in the population of small dust grains through shattering of large grains by grain-grain collisions in supernovae shock waves (Jones et al. 1996) . The sSFR distributions of the two metallicity samples are similar to each other (with average log(sSFR/M yr −1 ) = −8.43 and −8.53, and standard deviation of σ = 0.21 and 0.21 for the subsolar and solar metallicity samples, respectively). Therefore, to assess the degree of the variation of the IR SED with sSFR, we define two new bins of galaxies below and above log(sSFR/yr −1 ) = −8.45 with 15 and 12 galaxies, respectively, and fit their Herschel and ALMA data with the 2T-MBB models (middle panel of Figure 5 ). The difference in the warm dust temperature of the two sSFR SEDs is similar to that of the two metallicity SEDs (top panel of Figure 5 ). However, the SEDs of both the low and high sSFR stacks are still relatively broad, as the peak fluxes of the warm and cold components in both bins are similar to each other. As a comparison, the warm component in the solar-metallicity SED is 20 times more intense than the cold component, making the overall shape of the solar-metallicity IR SED narrower than that of the subsolar-metallicity one. SFR Surface Density -The equilibrium temperature of dust grains can also be increased by high ISM radiation field intensity. A proxy for radiation field intensity is the SFR surface density, Σ SFR , indicating the compactness of the star forming region. The effect of Σ SFR on dust temperature is shown in previous observational studies (Lehnert & Heckman 1996; Chanial et al. 2007; Burnham et al. 2021) , as well as the theoretical models of De Rossi et al. (2018) . In the latter study, a "bluer" SED is produced by increasing star formation efficiency, which is accompanied by a decrease in the virial radius, and hence an increase in luminosity density. It has also been shown that the luminosity surface density explains the similar mid-IR emission behavior of centrally concentrated local ULIRGs and that of high-redshift LIRGs (Elbaz et al. 2011; Rujopakarn et al. 2011 Rujopakarn et al. , 2013 . We use optical sizes, measured from HST/F160W images (van der Wel et al. 2014) , to calculate Σ SFR . There are recent studies that show the dust emission tend to be much more compact than the stellar emission at these redshifts (Rujopakarn et al. 2019; Popping et al. 2021) . However, as we use sizes to only separate the galaxies into two bins of Σ SFR , the discrepancy between the absolute optical (tracing mass) and IR (tracing star formation) sizes would likely not change any of the following main results. The Σ SFR of galaxies in our sample ( log(Σ SFR /M yr −1 kpc −2 ) = 0.19 and σ = 0.43) are consistent with those of the main-sequence z ∼ 2 galaxies calculated from the parent MOS-DEF sample ( log(Σ SFR /M yr −1 kpc −2 ) = −0.22 and σ = 0.82), but smaller than Σ SFR of local ULIRGs (log(Σ SFR /M yr −1 kpc −2 ) ∼ 1.5 − 2.5, from Tacconi et al. 2013 ). Within our sample, the subsolar-metallicity galaxies have on average a larger Σ SFR than the solar-metallicity galaxies. However, the correlation between Σ SFR and metallicity in the sample is weak (Pearson correlation coefficient of −0.44 with p-value of 0.02) and the standard deviation of Σ SFR within each metallicity subsample is large, making the Σ SFR distributions of the two metallicity bins not significantly distinct (Table 2 ). To better evaluate the effect of Σ SFR on the IR SED in our sample, we fit the 2T-MBB model with T c = 25 K and β = 1.5 to the stacks of MIPS, Herschel, and ALMA data in two bins of Σ SFR divided at log(Σ SFR /M yr −1 kpc −2 ) = 0, with 12 and 15 galaxies in the low and high Σ SFR bins, respectively. The fits are shown in the bottom panel of Figure 5 . As expected, the best-fit model of the high Σ SFR bin shows a warm component that is hotter than the one in the best-fit model of the low Σ SFR stacks. The warm dust temperature difference between the two bins is ∆T w = 23 ± 33, which is less significant than the warm dust temperature difference between the two metallicity bins. Equally important is the difference between the warm-to-cold luminosity ratios. Both the low and high Σ SFR best-fit models have warm and cold components with similar luminosities, indicating a broad IR SED that does not change with Σ SFR significantly. Stellar Population Age -Another parameter that affects dust emission properties is age. De Rossi et al. (2018) showed that a silicate-rich mixture with amorphous carbon dust can explain a hot and broad IR SED that originates from the relatively high emission efficiency of silicates between ∼ 8 − 60 µm. A source of silicate-rich dust is massive AGB stars (with masses above 3.5 M ; Ventura et al. 2012b,a). A 3.5 M star has a main-sequence lifetime of ∼ 400 Myr, suggesting that the silicate-rich dust composition would dominate the IR emission of galaxies younger than this age. We have age estimates for galaxies from the best-fit UV-to-near-IR SED models. However, as ages derived from exponentially rising star formation history models in SED fitting are ambiguous (e.g., see Section 6.2 of Reddy et al. 2012b ), a robust analysis of the effect of age on the IR SED shape is beyond the scope of this paper. Additionally, the luminosity-weighted ages are correlated with metallicity and anti-correlated with sSFR, which makes it difficult to disentangle the age effect from the other two parameters in this analysis. Summary -In conclusion, we find that a warm component is present across our sample, as seen in the wavelength separation of the warm and cold components' peaks in Figure 5 compared to that of the S18 model. Even in the low sSFR and low SFR surface density bins, the two peaks have wavelength separations of ∼ 265 ± 32 µm, which is wider than the wavelength separation between the cold and warm components modeled for the S18 template is 213 +14 −12 µm. This could be due to the high sSFR and Σ SFR of these galaxies compared to the average values for their stellar mass at z ∼ 2 (Figure 1) , which makes this analysis distinct from many other studies at z ∼ 2. For example, at z = 1.8 − 2.5, the S18 lowest mass sample has log(M * /M ) = 10.0 − 10.5, which is consistent with the stellar mass of our solarmetallicity sample (log(M * /M ) ∼ 10.2 − 10.6). The average SFR of the S18 sample based on their IR luminosity measurements and a Kennicutt & Evans (2012) relation (adopted for a Chabrier IMF) is ∼ 20 M /yr. At these masses, it is expected to have > 70% of the SFR in the obscured phase (Whitaker et al. 2017 ). Therefore, the log(sSFR/yr −1 ) of their low-mass sample is ∼ −8.8 to −8.9, which is lower than the sSFR of the majority of our sample with the average and scatter of log(sSFR/yr −1 ) = −8.4 and σ = 0.2 dex. Across our sample, we find that the IR SED gets broader and the temperature of the warm component increases with decreasing metallicity, increasing sSFR, and increasing SFR surface density. The subsolar metallicity, high sSFR, and high SFR surface density samples show wavelength separations between the peaks of their warm and cold components of ∆λ = 347 +26 −16 µm, ∆λ = 304 +6 −5 µm, ∆λ = 317 +55 −27 µm, respectively. These are ∼ 1.5 − 1.6 times higher than the wavelength separation of 213 µm between the cold and warm components modeled for the best-fit S18 template. The broadening effect is the most significant with metallicity, as the two dust components show the largest peak separation and temperature difference (∆λ = 347 +26 −16 µm and ∆T = 108 ± 33 K) at subsolar metallicities. The high SFR surface density and subsolar metallicity samples studied here are useful analogs to higher redshift galaxies, given the expected redshift evolution in size (Mosleh et al. 2012; van der Wel et al. 2014) , SFR (Speagle et al. 2014; Tasca et al. 2015) , and metallicity (Troncoso et al. 2014; Sanders et al. 2021 ) at a fixed stellar mass. As will be discussed in Section 5.2, the commonly-used narrower and colder IR templates that are calibrated based on local LIRGs and ULIRGs, fit to shorter wavelength data alone (λ rest 60 µm), overestimate the RJ emission of high-redshift galaxies. In the case of limited available data, our results suggest that templates with hotter and broader IR SEDs should be considered for typical (i.e., LIRG and lower luminosity) galaxies at z 2, such as the average local low-metallicity template presented in this work. Similar results have been shown previously by De Rossi et al. and Faisst et al. (2020) , who recommended using the hotter and broader IR template of the local lowmetallicity galaxies at z > 4. In the coming decade, the primary means to estimate IR luminosities, obscured SFRs, and dust masses will be JWST and mm/submm facilities such as ALMA and LMT/TolTEC, operating respectively at ∼ 21 µm and ∼ 0.5 − 3 mm, which correspond to the rest-frame PAH emission and FIR/submm dust continuum emission at z ∼ 2.3. Main-sequence typical galaxies similar to our subsolar-metallicity sample will be within reach for either approach and synergies between the observatories is a natural consequence. Since the results of this work indicate a change in the IR SED with metallicity, it is of concern how well either of the observed PAH or submm dust continuum measurements estimate IR luminosities and how well the PAH emission can estimate the submm fluxes and vice versa. The need for an improved prescription to predict the submm flux of high-redshift galaxies is underscored by the lower-than-expected detection rates of high-redshift galaxies in blind ALMA surveys (e.g., compare the predictions in Hatsukade et al. 2016 and Fujimoto et al. 2016 with the detection rate in Aravena et al. 2016 ). In the following subsections, we first describe the comparison samples at z ∼ 0 and 2 (Section 5.1), and then use the results of the previous sections on the evolution of IR SED with metallicity to discuss submm flux predictions and inferred IR luminosity (Section 5.2), obscured SFR (Section 5.3), and inferred dust masses (Section 5.4) of z ∼ 2.3 galaxies. As the main results of this section and the next section (5.4) rely primarily on ALMA photometry, we take advantage of the depth of our ALMA observations and construct stacks in more than two bins of metallicity. In this section, the uncertainties of the stacked photometry are only measurement uncertainties, and do not include the jackknife resampling as was performed in the previous sections. From a visual inspection, we exclude one of the 27 targets (ID 19753) , which is likely a merger or a complex system with multiple star-forming components, from this part of the analysis. The 1.2 mm and UV peak emission of this object are not co-spatial, and while the MOSFIRE slit is centered on the component with the peak UV emission, the IR emission (traced by ALMA) is mainly from another component. Excluding this source does not change the results of the best-fit SED models in the previous sections. Here we describe the comparison samples that are adopted from the literature. At z ∼ 0, we adopt four datasets with Herschel and Spitzer photometric observations that cover a wide range of galaxy populations in the local Universe from dwarfs to ULIRGs, as follows. • The Dwarf Galaxy Survey (DGS; Madden et al. 2013) consists of 48 star-forming dwarf galaxies with metallicities from 12 + log(O/H)= 7.14 to 8.43. The Herschel and Spitzer photometric data and measured galaxy properties (metallicity, SFR, stellar and dust mass) are collected from Madden et al. (2013) and Rémy-Ruyer et al. (2015) . • The Key Insights on Nearby Galaxies: a Far-Infrared Survey with Herschel (KINGFISH; Kennicutt et al. 2011) is a survey of 61 nearby galaxies drawn from the Spitzer Infrared Nearby Galaxies Survey (SINGS), selected to span wide ranges in luminosity, optical-to-IR ratio, and morphology, with metallicities from 12+log(O/H)= 7.54 to 8.77. The metallicities and AGN classification are taken from Kennicutt et al. (2011) . The Spitzer and Herschel photometry of the sample are listed in Dale et al. (2007) and Dale et al. (2012) , respectively. To be consistent with the DGS measurements, we adopt the SFR, stellar mass, and dust mass of KINGFISH galaxies from Rémy-Ruyer et al. (2015). • To complete the sample of nearby galaxies, we also add data from Great Observatories All-sky LIRG Survey ( As the main-sequence comparison sample at z ∼ 2, we adopt data from two surveys, as follows. Neither of the two samples have metallicity measurements. (Skelton et al. 2014 ). Out of the 10 galaxies, only 4 are detected with PACS at 100 µm. There are no metallicity measurements for these galaxies. For consistency with our measurements, we calculate the dust masses from ALMA band-6 fluxes in the same way as done for our sample (Equation 3) • The average SFR, stellar mass, and dust masses of Santini et al. (2014) at z = 2.0 − 2.5 are also adopted. Dust masses in this study are derived from SED fitting to the PACS and SPIRE 100-to-500 µm photometric stacks in bins of stellar mass and SFR. There are no metallicity measurements. It is often the case at high redshifts that limited IR data are available to estimate IR luminosity or to predict flux densities of other parts of the IR SED. Therefore, for simplicity often a single locally-calibrated IR template is assumed and fit to the limited data. Here, we investigate how well the IR luminosities and IR colors of our z ∼ 2.3 LIRG sample match with the commonly used IR templates of local LIRGs. IR luminosity from ALMA -The IR luminosity (8 to 1000 µm) of our subsolar metallicity bin is log(L(IR)/L ) = 11.35, calculated from the best-fit average local low-metallicity template in Figure 4 . The S18 T = 35 and the R09 log(L(IR)/L ) = 11.25 templates fit to the ALMA submm flux density alone underestimate the IR luminosity by 0.1 − 0.2 dex. The colder templates (e.g., the S18 T = 30 K or the R09 log(L(IR)/L ) = 11.00 templates) underestimate the IR luminosity by > 0.4 dex. This is because these templates underestimate the elevated mid-IR emission in these galaxies. The IR luminosity of the low-metallicity z ∼ 2.3 galaxies is generally less biased when the templates are fit to PACS data, or when the broader dwarf templates are used to fit the MIPS data (e.g., see Appendix A in Shivaei et al. 2020a ). The broader templates such as the one we constructed here from the local low-metallicity galaxies are necessary when only limited submm flux observations are available. ALMA flux from IR luminosity -From another point of view, we examine how well the submm fluxes can be predicted based on the IR luminosity of our galaxies. The observed ALMA flux to L(IR) ratio of our subsolar metallicity bin is log(f 1200 /L(IR)/µJy L −1 ) = −9.52 ± 0.05. This observed value is lower than predictions from locally-calibrated IR templates, such as those of R09 and S18. As an example, the R09 log(L(IR)/L ) = 11.25 and 11.50 templates have submm flux-to-L(IR) ratios that are ∼ 0.2 − 0.3 dex larger than our observed value. The higher-than-observed submm flux to L(IR) ratio of the templates is due to the colder dust temperature of the templates at a given IR luminosity compared to the luminosity-weighted average temperatures of the galaxies in this work. The hotter average temperatures can be attributed to the higher sSFRs of the galaxies compared to the main-sequence. Similarly, the ASPECS galaxies that are located above the z ∼ 2 main-sequence relation (Shivaei et al. 2015b) show observed 1.2 mm flux-to-L(IR) ratios 31 that are lower than those predicted by the aforementioned local LIRG templates by > 0.3 dex. 31 Here, the L(IR) of ASPECS sample is estimated from UV-to-IR SED fitting (Boogaard et al. 2019 ). ALMA flux from mid-IR and PAH emission -Another common scenario for high-redshift surveys is that only photometry shortward of the IR emission peak is available (e.g., from Spitzer, Herschel, or future JWST/MIRI) to predict submm fluxes. In this case, we assess how well the rest-frame 30-to-8 µm, 360-to-8 µm, and 360-to-30 µm IR colors of the z ∼ 2.3 galaxies match with those of the z ∼ 0 samples and local LIRG IR templates. The z ∼ 2.3 sample in this work has relatively good constraints on the observed 24, 100, and 1200 µm photometry. These bands correspond to rest-frame 8, 30, and 360 µm, respectively, which for a z ∼ 0 sample are roughly traced by IRAC 8 µm, MIPS 24 µm, and SPIRE 350 µm. In Figure 6 , we show the IR colors based on the aforementioned bands for the z ∼ 2.3 sample and the z ∼ 0 comparison samples (Section 5.1) 32 . Overplotted on Figure 6 are also the IR color predictions from four IR templates: a) the R09 LIRG templates with log(L(IR)/L ) = 11.00, 11.25, 11.50, and 11.75 (dark red crosses with sizes increasing with increasing luminosity). These three models are in the range of recommended templates for galaxies at z > 1 by Rujopakarn et al. (2013) and De Rossi et al. (2018) , b) the S18 templates with T = 35 and 45 K and PAH fraction of 0.01 and 0.1 (pink crosses with sizes increasing with increasing PAH fraction). The T = 35 template is the one that is recommended by Schreiber et al. (2018) to be used for high-redshift galaxies, c) the starburst and mainsequence templates of Elbaz et al. (2011, orange crosses with the small and large ones for the main-sequence and starburst models, respectively), and d) the average local low-metallicity template (magenta plus sign, Section 3). The rest-frame 30-to-8 µm colors of the z ∼ 2.3 galaxies vary by a factor of 5 between the lowest metallicity bin (12 + log(O/H) B18 = 8.33) and the rest, which shows suppressed PAH emission at low metallicities, as expected. The 24 µm SNR of the lowest metallicity bin is only 2.3, which makes its 30-to-8 µm color consistent with that predicted by the highest luminosity R09 template and the hotter S18 template within 1σ. However, we know from previous studies with larger samples and better constraints on the PAH emission of z ∼ 2 galaxies, that the PAH emission at such metallicities at z ∼ 2 is suppressed compared to that of the local LIRG templates (Shivaei et al. 2017) and their IR SED shape resembles that of the local low-metallicity galaxies with low PAH fractions (Shivaei et al. 2020a) . The IR colors of the lowest metallicity stack in this work also overlap with those of some of the local dwarfs (with 12 + log(O/H)∼ 7.8), likely due to the low PAH emission and hot dust component of the local dwarfs that resemble the dust emission characteristics of the z ∼ 2 galaxies with metallicity of ∼ 0.4 Z . In Figure 6 , the models that are matched to the 30-to-8 µm color of the higher two metallicity stacks (12+log(O/H) B18 ∼ 8.5−8.7) overpredict the rest-frame 360 µm flux by at least a factor of 2.5 (4σ) using the R09 and the T = 35 K S18 models, and even more if either of . IR colors from rest-frame 360, 30, and 8 µm emission for the z ∼ 2.3 stacks in this work (large stars) compared to those of z ∼ 0 surveys and the z ∼ 2 ASPECS galaxies. Comparison samples are described in Section 5.1. Crosses show model predictions: The magenta plus sign is the average local low-metallicity template, the small and large orange crosses (connected with a dotted line) show the main-sequence and starburst models of Elbaz et al. (2011) , respectively. Dark red crosses (connected with a solid line) show the R09 models with IR luminosities of 10 11 , 10 11.25 , 10 11.5 , and 10 11.75 L , in order of increasing size. Pink crosses (connected with dot-dashed lines) show the S18 models for two temperatures (indicated on the plots) and two PAH fractions (0.01 and 0.1 for the small and large cross, respectively). The R09 and S18 templates fit to the 24 and 100 µm photometry alone overestimate the observed submm flux and the IR luminosity calculated from the local low-metallicity fit to 24-to-1200 µm data. the Elbaz et al. (2011) templates are adopted. This effect can also be seen in Figure 7 where the models are fit to the observed 24 and 100 µm points alone, overpredicting the observed 1200 µm emission (except for the local low-metallicity template). The middle panel of Figure 6 indicates that the discrepancy between the observed and model predicted 350-to-30 µm colors increases with increasing sSFR, such that the higher sSFR bin has observed colors about an order of magnitude lower than those predicted by the R09 and the T = 35 K S18 templates and that match better with the higher temperature templates, while the lower sSFR bin is discrepant with the two mentioned models only at a 2 − 3 σ level. Metallicity and sSFR are highly correlated with each other and separating their effects on the observed 360to-30 µm color is not possible with the current dataset. The colors of the four ASPECS galaxies with detected observed 24, 100, 1200 µm data are in agreement with our two higher metallicity stacks. The other 6 ASPECS galaxies are only detected in the observed 1200 and 24 µm bands and are shown by horizontal lines (i.e., no constraints on the PACS photometry) -again in agreement with the rest-frame 360-to-8 µm colors of our two higher metallicity stacks. These results indicate that, irrespective of the PACS observations, the locally calibrated LIRG templates anchored to the rest-frame 8 µm flux density may overestimate the submm flux of typical galaxies at z ∼ 2.3 by ∼ 1.5 up to an order of magnitude. In summary, using the local LIRG templates and the rest-frame 8 µm emission alone or the combination of the 8 µm and ∼ 30 µm emission tends to overestimate the submm fluxes of the high-redshift LIRGs. This effect becomes more pronounced with increasing sSFR. Hotter templates (e.g., those with T > 40 K from the S18 library or the R09 templates with log(L(IR)/L > 11.75 R09) predict more accurate submm fluxes based on shorter wavelength data. A similar conclusion holds for the local low-metallicity template derived in this work (Section 3), where the submm fluxes of z ∼ 2.3 LIRGs are well predicted based on 8 or 30 µm emission. However, all templates predict lower 30-to-8 µm colors than the lowest metallicity z ∼ 2.3 galaxies (12 + log(O/H)∼ 8.3). The IR colors of these low-metallicity z ∼ 2.3 galaxies are similar to those of some of the lowest metallicity local dwarfs albeit their 0.2 − 0.6 dex higher oxygen abundances, suggesting very weak PAH emission at gas metallicities of 12 + log(O/H)∼ 8.3 at z ∼ 2.3 (similar to the findings of Shivaei et al. 2017 ). We estimate the total SFR of the sample from dustcorrected Hα luminosity using the Balmer decrement (Section 2.4.1; Table 2 ). As it has been shown elsewhere (Shivaei et al. 2016) , this procedure is a good estimator of total SFR in galaxies that are not heavily dust-obscured, such as those in this study (see sample characteristics in Section 2.1). Comparing the total SFR with the obscured SFR derived from L(IR) (using the calibrations of Kennicutt & Evans (2012) ) demonstrates that there is a significant fraction of unobscured star formation that does not contribute to the IR emission in both metallicity bins. The ratios of SFR(IR) to dust-corrected SFR(Hα) for the subsolar and solar-metallicity stacks in this work are 59% and 57%, respectively. Previous studies have shown that the obscured fraction of star formation (the ratio of SFR(IR) to bolometric SFR) decreases with decreasing mass (e.g., Reddy et al. 2010; Whitaker et al. 2017) , however the ∼ 60% obscured fraction of the our sample, with average stellar mass of log(M * /M ) = 10.18 and 10.43 in the subsolar and solar metallicity bins, is lower than that found previously. For example, Whitaker et al. (2017) predict an obscured fraction of 86 ± 2% at log(M * /M ) = 10.18 from their average fit to the data. However, the tail of their obscured fraction distribution at these masses extend to ∼ 50%. The discrepancy with the average prediction may be due to a bias towards less obscured star forming galaxies in our sample, as the sample is selected to have significant detection in optical nebular emission lines. However, the Whitaker et al. (2017) sample is also selected based on rest-frame optical continuum emission, and presumably biased against heavily obscured systems. The discrepancy may also originate from the way IR luminosity is estimated in Whitaker et al. (2017) . In that work, the authors convert 24 µm fluxes to IR luminosity from a single log average of the Dale & Helou (2002) templates. The use of a single template over all luminosities is an oversimplification. For example, using a single IR template of R09 (e.g., the log(L(IR)/L ) = 11.25 or 11.50 template) to convert the 24 µm flux to IR luminosity, overestimates the IR luminosity by a factor of 2 because the FIR/submm emission predicted by these templates is higher than the observations (Figure 7) . If the unobscured SFR is about half of the obscured SFR (i.e., the obscured fraction is 0.67), then a factor of 2 overestimation in the obscured SFR results in an obscured fraction of ∼ 0.8, which is similar to the discrepancy we see between the low-metallicity obscured fraction and that predicted by Whitaker et al. (2017) . These results indicate that the unobscured SFR may be more significant in the subsolar metallicity and/or high sSFR galaxies at z ∼ 2.3 than has been previously assumed. Popping+2017, z=0 Popping+2017, z=2 Figure 8 . Dust mass versus stellar mass (left) and metallicity (right) for our sample in comparison with other surveys. Dust masses for galaxies in this work are shown by black circles for ALMA detections (SNR> 2) and gray circles with downward arrows as 2σ upper limits for the nondetections. Dust masses from the ALMA stacks in three bins of stellar masses (left) or metallicity (right) are shown with orange squares (Table 3) . Samples from the left panel that do not have metallicity measurements are not shown in the right panel. High-redshift galaxies are on the metallicity scale of Bian et al. (2018) . Semi-analytic models of Popping et al. (2017) at z = 0 and 2 are shown in both panels. At a given stellar mass or metallicity, the z ∼ 2.3 data show about an order of magnitude higher dust mass compared to z ∼ 0. The bulk of dust mass (M dust ) in galaxies is from the cold dust population that dominantly emit at FIR/submm wavelengths, making the RJ emission a good diagnostic for dust masses. Following the discussion in Scoville et al. (2016) , we derive dust masses from the ALMA 1.2 mm flux densities, assuming an optically-thin MBB, as: where S ν is the 1.2 mm flux density (in the observed frame), f CMB is the correction factor for the cosmic microwave background (CMB) effect on the background at the redshift of the targets (Equation 18 in da Cunha et al. 2013) 33 , D L (z) is the luminosity distance to redshift z, B ν (T ) is the Planck function with dust temperature T , κ ν (β) is the dust grain absorption cross section per unit mass at frequency ν with a functional form of κ 0 ( ν ν0 ) β , where κ 0 is the opacity at ν 0 and β is the submm emissivity index. The main assumptions that enter this calculation and contribute to the uncertainties on dust masses are the temperature of the cold component and the submm emissivity and opacity parameters. For the cold dust temperature, we perform a MC simulation by drawing dust temperatures from a Gaussian distribution with a mean of 25 K and σ = 5 K. We assume an emissivity index of β = 1.5, as our subsolar-metallicity data suggest a β shallower than 2 (wider IR peak), in agreement with previous studies (see Appendix B) . Based on the choice of β, an opacity of κ 0 = 0.232 m 2 kg −1 at 250 µm 33 At the dust temperatures and redshifts of the galaxies under study the additional dust heating by CMB is negligible, but due to the effect of the CMB background at the observed wavelength, the observed flux against CMB is ∼ 80% of the intrinsic flux (see Figure 3 in da Cunha et al. 2013). is adopted (Draine 2003; Bianchi 2013) . The κ 0 value is the most uncertain factor in dust mass estimations. Different assumptions of κ 0 from the literature affect dust mass estimations by up to a factor of 3. In Appendix B, we discuss our assumptions on dust temperature, β, and κ 0 and their systematic uncertainties in detail. Figure 8 shows the individual dust mass measurements and the stacked values in bins of stellar mass and metallicity of the galaxies in this work, compared with other samples at z ∼ 0 to 2 (for the description of the comparison samples refer to Section 5.1). The z ∼ 2 samples show consistent dust masses where they overlap in stellar mass. At a given stellar mass, dust mass increases from z ∼ 0 to 2 by about a factor of 10. An increase in dust mass at a given stellar mass has been previously seen at higher stellar masses (Santini et al. 2014; Kirkpatrick et al. 2017) , and is expected owing to the increase in gasto-stellar mass ratio from z ∼ 0 to 3 (Schinnerer et al. 2016; Tacconi et al. 2018; Decarli et al. 2020 ). We also see higher dust masses at a given metallicity at z ∼ 2.3 compared to z ∼ 0 in the right panel of Figure 8 . If the PP04 metallicity scale is used instead of the B18 scale for the z ∼ 2 sample, the redshift evolution of dust mass at a fixed O/H would be even larger. Therefore, the systematic uncertainties associated with the metallicity estimates at high redshifts are unlikely to artificially induce the observed trend in dust mass evolution. To better understand the redshift evolution of dust mass fractions, we plot dust-to-stellar mass ratio (D/M * ) and dust mass to SFR ratio (D/SFR) as a function of metallicity in Figure 9 . As also seen in the z ∼ 0 sample (Rémy-Ruyer et al. 2015) and the simulations (Popping et al. 2017) , there is no statistically strong correlation between the D/M * and metallicity at z ∼ 2. However, there is about an order of magnitude increase in D/M * from z ∼ 0 to 2 across all metallicities that are covered in Popping+2017, z=0 Popping+2017, z=2 Figure 9 . Top: Dust-to-stellar mass ratio as a function of metallicity for individual galaxies in this work and their stacked values in three bins of metallicity. Symbols are the same as in Figure 8 . AGN are excluded from the KINGFISH sample ) and the HRS sample . Bottom: Dust mass to SFR as a function of metallicity for the same objects as in the top panel. Symbols are color coded by the ratio of their SFR to the main-sequence SFR at the given mass. For the z ∼ 2.3 sample, the main-sequence relation of Shivaei et al. (2015b) is adopted (modified for the same IMF as used in this work) and for the z ∼ 0 objects, the main-sequence relation of Salim et al. (2007) is adopted. Semi-analytic models of Popping et al. (2017) at z = 0 and 2 are shown in both panels. this work. In Section 5.4.2, we estimate gas masses from star-formation law scaling relations to compare with our dust mass measurements and discuss the possible explanation and implications of the dust mass evolution. The galaxy formation semi-analytic models of Popping et al. (2017) at z = 0 and 2 are shown in Figure 8 . These models predict less dust mass at z ∼ 2 compared to the observations, particularly at lower stellar mass and metallicities. This discrepancy originates from an underproduction of molecular gas masses of the z ∼ 2 main-sequence galaxies in the models compared to observations (Somerville & Davé 2015; Popping et al. 2019) . Given that in the Popping et al. (2017) model the dust growth mechanism at these metallicities is dominated by the accretion of metals onto grains in the dense ISM, an underestimation of molecular gas mass compared to the observations also results in a lower than observed dust masses. The discrepancy is most pronounced at lower metallicities in the right panel of Figure 8 , which stems from a known outcome of the model that dust-to-metal ratios are lower than the observations at 12 + log(O/H) 8.5 (Popping & Péroux, in prep) . In Figure 9 , the Popping et al. (2017) semi-analytic models are in good agreement with D/SFR-metallicity relation at z ∼ 2, even though they underestimate dust masses at lower metallicities in Figure 8 . In these models, the main mechanism of dust formation at these metallicities is through dust growth in the ISM, which is a strong function of molecular hydrogen surface density and metallicity. On the other hand, SFR is also regulated by H 2 mass through empirical relations. Therefore, although the H 2 gas mass predicted by these models is less than that observed at z ∼ 2, the ratio of dust mass to SFR stays in agreement with observations. In other words, in the models, dust to molecular gas mass and SFR to molecular gas mass relations with metallicity are in agreement with observations, yet dust mass and SFR are lower than those observed at these metallicities. An increase in dust masses from z = 0 to 2 at a given stellar mass is in tension with the apparent lack of redshift evolution in the obscured SFR fraction and IR to UV luminosity ratio (IRX) or UV obscuration (A 1600 ) at a fixed stellar mass over the same redshift range (Bouwens et al. 2016; Whitaker et al. 2017; Reddy et al. 2018a; Shapley et al. 2021 , but Shivaei et al. 2020a found an evolution in IRX at a given UV slope with mass 10 11 10 10 10 9 10 8 10 7 and metallicity, see below). One possible explanation is a change in the dust-star geometry of main-sequence galaxies at z ∼ 2 compared to the local galaxies. The bulk of dust mass is determined by the cold dust population, while the IR emission is dominated by the emission from warmer dust grains in the actively star-forming regions. A two disjoint dust component geometry, where one component is the hot dust in the birth clouds of recently formed stars and the other is the colder dust that resides in the diffuse ISM (Charlot & Fall 2000) , would reconcile the high D/M * of z ∼ 2 galaxies with the lack of evolution in their obscured SFR fraction (and IRX) vs stellar mass relation. There is observational evidence for the emergence of such a two-component dust geometry in subsolar metallicity galaxies at z ∼ 2 from a comparison of the dust reddening of ionized nebular gas and that of stellar continuum (Figure 8 in Shivaei et al. 2020b) . A higher average nebular dust reddening compared to the stellar reddening for subsolar metallicity z ∼ 2 galaxies suggests two different dust populations affect the emission from massive young stars and older stars. Spatially resolved observations (on the scales of a few tens of pc) are required to verify this physical picture. Different dust characteristics (composition and size) with different radiation efficiencies may also play a role. For example, Shivaei et al. (2020a) showed that at a given UV continuum slope β (i.e., stellar continuum reddening, not to be confused with the β that characterizes submm dust emissivity), the lower metallicity galaxies have lower IRX compared to the higher metallicity ones, which may indicate different dust characteristics in low-metallicity galaxies. Other studies have shown similar results that younger and lower mass galaxies at z ∼ 2 have a lower IRX than high-mass galaxies at a fixed UV slope (e.g., Reddy et al. 2012a; Reddy et al. 2018a; Fudamoto et al. 2019) . In this picture, when comparing the z ∼ 0 and 2 galaxies at a given stellar mass, the z ∼ 2 galaxies have lower metallicities, and although they have higher dust masses, dominated by large grains emitting at FIR/submm wavelengths, their light-weighted IR luminosity is not proportionally higher as it is dominated by the re-emitted light from the smaller grains whose characteristics have changed. Another important factor that might have contributed to the lack of a redshift evolution in the IRX (or obscured SFR)-stellar mass relations in the literature is the uncertainties in measuring L(IR) and obscured SFR of low-mass/low-metallicity galaxies at z ∼ 2. For example, in the presence of a significant warm dust component, a cold-dust IR template tends to underestimate the IR luminosity (as discussed in Section 5). On the other hand, obscured SFRs derived from PAH emission could be underestimated if the reduced intensity of PAH emission relative to L(IR) at low metallicities is not considered (Figure 7 in Shivaei et al. 2017 ). Gas masses from star-formation law scaling relations -We first estimate gas masses from the scaling relations of Tacconi et al. (2018) , independent of our dust mass measurements (Table 3) . Tacconi et al. (2018) used a combination of CO observations, dust SEDs, and MBB fits to the RJ tail to estimate molecular gas masses. The gas masses of galaxies with M * < 10 10.5 M at z > 1 in that work are dominantly from either of the latter two methods (no CO observations). For those galaxies, the molecular gas mass is derived from the dust mass, assuming a linear relation between dust to molecular gas ratio Tacconi et al. (2018) molecular gas fraction equation (Eq. 6 in that paper), which is a function of redshift, sSFR offset from main-sequence, stellar mass, and optical effective radius. The main-sequence and size evolution relations adopted in the Tacconi et al. (2018) and metallicity 34 , where metallicities are estimated from a mass-metallicity scaling relation. The Tacconi et al. (2018) scaling relations at higher metallicities (higher stellar masses) are based on direct CO observations and not dust mass estimates from IR/submm data. These estimates are subject to uncertainties in CO-to-H 2 conversions (Bolatto et al. 2013) . A more detailed discussion on the CO conversion uncertainties is beyond the scope of this work and will be addressed in future work where direct CO observations are available. D/M * is proportional to dust-to-gas mass ratio (D/G) multiplied by gas-to-stellar mass ratio (G/M * ). Using the scaling relations of Tacconi et al. (2018) between molecular gas mass, M * , and SFR over a redshift range of z = 0 − 4, we estimate the molecular gas masses of our galaxies based on their redshift, offset from the main sequence 35 , and stellar mass. For consistency, we also calculate molecular gas masses in the same manner for a subset of the DGS, KINGFISH, and HRS samples that have similar average metallicities as our three metallicity bins (12 + log(O/H)∼ 8. 3, 8.5, and 8.7 , for the DGS, KINGFISH, and HRS subsamples, respectively). Assuming no redshift evolution in D/G at a given 34 The Tacconi et al. (2018) molecular gas mass estimates at low stellar masses is similar to that in Leroy et al. 2011 , and perhaps the reason for the better agreement between the two dust to molecular gas ratio estimates at lower metallicities in Table 3 . 35 Here we adopt the Speagle et al. (2014) main-sequence relation to be consistent with that assumed in Tacconi et al. (2018) . metallicity between z ∼ 2.3 to 0 (shown at solar metallicities in Shapley et al. 2020 and Popping et al., in prep) and using the estimated molecular G/M * ratios from the Tacconi et al. (2018) relations as described, D/M * is expected to be ∼ 20 − 50 times higher in the z ∼ 2.3 sample compared to the z ∼ 0 comparison samples. However, our observations show a factor of ∼ 2−10 smaller change in D/M * at a given metallicity. This implies that dust to molecular gas mass is lower at z ∼ 2.3 compared to that at z ∼ 0 by a factor of ∼ 2 − 10. A drastic change in D/G is anticipated for galaxies with different star formation efficiencies (SFE; SFR per unit molecular gas mass), when the metallicities are close to a "critical" metallicity defined by Asano et al. (2013) . Critical metallicity is the metallicity at which dust mass growth in the ISM becomes equal to dust production from stars (AGB stars and SNe). According to the theoretical dust evolution models of Asano et al. (2013) , the critical metallicity itself depends on SFE, and becomes larger with shorter star formation timescales (τ SF , defined as gas mass to SFR). Based on their simulations, at 12 + log(O/H)= 8.4, the DtG of a model with τ SF = 0.5 Gyr is 13 times lower than that of a model with τ SF = 5 Gyr. In the bottom panel of Figure 9 , the clear trend of decreasing D/SFR at a given metallicity with increasing SFR offset from the main-sequence (SFR/SFR MS ; color-coding in Figure 9 , bottom panel) is a result of increasing SFE -both with increasing offset from the main sequence and with increasing redshift. In summary, the higher D/M * at z ∼ 2.3 compared to z ∼ 0 at a given metallicity in Figure 9 can be explained by the higher molecular gas fractions along with the lower D/G (due to the higher SFEs) of the z ∼ 2.3 galaxies. As a comparison, Table 3 also shows gas masses derived from the Schmidt-Kennicutt star formation law adopted from Kennicutt & De Los Reyes (2021) . The single power-law relation in Kennicutt & De Los Reyes (2021) (with slope n = 1.5) represents the overall relation for the nearby starbursts and nonstarbursting disk galaxies. However, the resolved Schmidt-Kennicutt relation has not been tested for main-sequence galaxies at higher redshifts. There are resolved CO and dust observations of z ∼ 1 − 3 galaxies that indicate a different spatial distribution of molecular gas and dust compared to the stellar emission (e.g., Calistro Rivera et al. 2018; Kaasinen et al. 2020) . These observations are typically limited to more massive and/or more actively star-forming galaxies than those in this work. At last, whether the F160W (rest-frame optical) sizes that we adopt here are representative of where recent star formation activity occurs, is another source of uncertainty in the gas masses derived from the Schmidt-Kennicutt relation in Table 3 . Resolved CO and dust observations of larger samples of main-sequence galaxies at z > 1 would help to shed light on these uncertainties. Gas masses from D/G-metallicity relations -Using the metallicity of our galaxies, we also calculate the expected D/G ratios based on the D/G-metallicity relations of Leroy et al. (2011 ), Rémy-Ruyer et al. (2014 ), and De Vis et al. (2019 in Table 3 : the Leroy et al. (2011) relation between molecular gas and dust mass is derived based on the resolved CO, Hi, and IR maps of five nearby galaxies that span an order of magnitude in metallicity from 12 + log(O/H)= 8.0 − 9.0. Rémy-Ruyer et al. (2014) used integrated CO, Hi, and IR data for 126 local galaxies from the DGS, KINGFISH, and a subsample of the sample presented in Galametz et al. (2011) , covering two orders of magnitude in metallicity. The work of De Vis et al. (2019) is based on 466 local galaxies from the DustPedia project 36 with metallicities from optical lines, Hi observations (H 2 is calculated from a scaling relation between the H 2 -to-Hi ratio and the Hi-to-stellar mass ratio), and IR data. In Table 3 , the Tacconi et al. (2018) and Leroy et al. (2011) estimates are for molecular gas fractions, while the Rémy-Ruyer et al. (2014) and De Vis et al. (2019) relations include atomic gas fractions as well. While the molecular gas is expected to dominate the total gas fraction in the more massive and metal-rich galaxies at z ∼ 2 (Tacconi et al. 2018) , the atomic-to-molecular gas fraction increases with decreasing metallicity (Rémy-Ruyer et al. 2014) . This may explain the large difference between the De Vis et al. (2019) predictions and the Tacconi et al. (2018) and Leroy et al. (2011) values at low metallicities, although the latter two are consistent with the estimates of Rémy-Ruyer et al. (2014) . Such discrepancies underscores the importance of CO observations of subsolar metallicity galaxies at z > 1 to robustly constrain their dust to molecular gas ratios, even though the estimates will be subjected to the CO-to-H 2 conversion 36 http://dustpedia.astro.noa.gr/ uncertainties (Bolatto et al. 2013 ). Several studies have attempted to model D/M * evolution in a galaxy (e.g., Asano et al. 2013; Nanni et al. 2020 ) and its evolution with redshift (e.g., Calura et al. 2017; Popping et al. 2017) . To track the time evolution of dust within a galaxy, we look at the D/M * versus sSFR diagram (or dust formation rate diagram) as shown in Figure 10 . We show dust evolutionary tracks from two different models, those of Asano et al. (2013) and Nanni et al. (2020) . The models assume various dust production (stellar origin and dust mass growth in the ISM) and destruction (SN shocks) channels. In the Asano et al. (2013) models, dust production at low metallicity and high sSFR is dominated by stellar sources. Once the galaxy reaches its critical metallicity in the ISM, dust mass growth by metal accretion on the dust grains in the ISM dominates the dust production. At that point, dust mass increases more rapidly than the rate of star formation (hence the steep rise in the DtS). After that, when the majority of metals is locked up in grains, dust mass growth saturates but the stellar mass build up continues, resulting in a decrease in sSFR and DtS. Figure 9 shows that our data, with dust and metallicity measurements, occupies a distinct parameter space with higher sSFR and higher DtS compared to the majority of the z ∼ 0 population. The z ∼ 2 data of Santini et al. (2014) and the ASPECS program overlap with our measurements (but without metallicity constraints). One can argue that an Asano et al. (2013) model with a lower depletion time than that assumed in their work (τ =500 Myr shown by the blue solid curve is the model with the shortest timescale in that work) may cover the z ∼ 2 parameter space. However, typical depletion timescales for these galaxies are not expected to be much shorter than a couple of 100 Myr (e.g., Tacconi et al. 2018; Boogaard et al. 2019) . The z ∼ 2.3 data can also be explained by the Nanni et al. (2020) models that have outflows and a higher SN condensation efficiency compared to the Asano et al. (2013) models. Nanni et al. (2020) assumed a top-heavy IMF as their fiducial model. A typical IMF (e.g., a Chabrier IMF) would shift the curves downward in this diagram. Dissecting the exact theoretical dust evolutionary model that best describes our dataset is beyond the scope of this paper. However, it is clear that given the large number of uncertain parameters in the theoretical models, the combination of dust mass, metallicity, and sSFR at z > 0 opens a new parameter space to constrain the theoretical models with observational evidence. In this work, we present band-6 (1.2 mm) ALMA continuum observations of a sample of 27 star-forming galaxies at z = 2.1−2.5, with robust metallicity and SFR measurements from rest-frame optical emission lines (Hβ, [Oiii] λ5008, Hα, [Nii]λ6585), carefully selected to represent a wide range in metallicity, 12 + log(O/H) B18 ∼ 8.1−8.8, and stellar mass, log(M * /M ) ∼ 9.7−10.6. The galaxies are selected from the rest-frame optical spectroscopic MOSDEF survey and are located in the COSMOS field with a wealth of photometric data covering restframe UV to FIR. Using the Spitzer, Herschel, and ALMA data, we constrain the IR SED and dust masses of the z ∼ 2.3 subsolar metallicity galaxies and compare them with those of local galaxies spanning a wide range in galaxy populations from low-metallicity dwarfs to starbursts. The low metallicity and high sSFR of the z ∼ 2.3 galaxies in this work make them ideal analogs for typical star-forming galaxies in the early Universe. The main conclusions of the paper are as follows. We find that the average IR SED of the z ∼ 2.3 subsolar metallicity galaxies ( 12 + log(O/H) B18 = 8.4) has an additional warm dust component (with peak temperature of about 100 K) that broadens the IR SED, a behavior similar to that of the local dwarf galaxies but different from the commonly adopted LIRG and ULIRG templates (Section 3 and Figure 4 ). The width of the IR SED of the z ∼ 2.3 solar-metallicity galaxies is similar to that of the local LIRGs, as expected (Section 3 and Figure 3) . The IR SED also becomes broader and warmer with increasing sSFR, and increasing SFR surface density (Section 4.2 and Figure 5 ). However, the broadening effect is the most significant when the sample is divided by gas metallicity, such that the ∼ 30 − 80 µm IR SED changes from a dominant single warm MBB component with T ∼ 50 K at solar metallicity to two distinct components with T ∼ 25 and 130 K at subsolar metallicity (Section 4.1 and Figure 5 ). The warm and broad IR SED of the z ∼ 2.3 subsolar metallicity galaxies can be attributed to a more transparent ISM as well as a harder ionizing radiation that increase the dust temperature to higher values and in larger volumes, similar to that of the local dwarf galaxies with ∼ 0.1 − 0.6 dex lower oxygen abundances (O/H). It is also partly due to the higher star formation surface density and sSFR of the z ∼ 2.3 galaxies compared to the z ∼ 0 ones at a given stellar mass, which result in a wider range of interstellar radiation field intensities that dust grains are exposed to, and a higher relative abundance of hot grains in the vicinity of Hii regions, emitting at shorter wavelengths.A more clumpy dust geometry of the low metallicity galaxies may also contribute to the broadening effect (Section 4.2). The presence of a significant warm dust component in the subsolar metallicity and high sSFR galaxies has important implications for deriving IR luminosity and obscured SFR of high-redshift galaxies, as well as for predicting submm fluxes with limited optical to mid-IR data. We show that the locally calibrated LIRG templates overestimate submm ALMA fluxes of z ∼ 2.3 LIRGs by a factor of ∼ 1.5 up to an order of magnitude, if anchored to the PAH emission and/or mid-IR continuum emission (rest-frame ∼ 30 µm; Figures 6 and 7) . The overestimation increases with increasing sSFR. Adopting a broader SED with a prominent warm dust component, similar to those of local dwarf galaxies, alleviate the overprediction of the submm fluxes. On the other hand, the IR luminosity can be underestimated by > 0.4 dex if the cold LIRG templates are used and scaled to a single submm ALMA continuum flux. This is often the case at z > 3, where only a single FIR datapoint is available to estimate IR luminosities. As such galaxies have relatively lower metallicities and are often selected to have vigorous star formations, we recommend to adopt warmer and broader templates to mitigate the underestimation of IR luminosity (Section 5.2). The average local low-metallicity template constructed in this work (Section 3.2) is publicly available 37 . Using the best-fit IR models with the ALMA constraint, we estimate total L(IR) and obscured SFR for the sample. Comparing the obscured SFR to total SFR derived from optical emission lines, we find that the obscured SFR of the subsolar metallicity galaxies with average stellar mass of log(M/M ) = 10.18 is ∼ 60% of the total SFR. This is 1.5 times lower than the average value indicated in previous studies for the general population of galaxies at this stellar mass (Section 5.3). We find that dust masses (derived from ALMA 1.2 mm data) are about an order of magnitude higher at z ∼ 2.3 compared to z ∼ 0 at a given stellar mass and metallicity. An order of magnitude higher dust-to-stellar mass implies an order of magnitude higher gas-to-stellar mass fractions at z ∼ 2.3 compared to z ∼ 0, if dust-to-gas mass ratio stays constant. However, CO-based studies (Tacconi et al. 2018 ) predict a larger dust-to-stellar mass evolution for the star-forming galaxies in this work, which suggests that dust in these galaxies is depleted (i.e., lower dust to molecular gas mass ratios) by a factor of > 2. A lower dust-to-gas ratio can be a result of a high star-formation efficiency in the galaxies in this sample (Section 5.4). The higher dust-to-stellar mass ratios of z ∼ 2.3 galaxies compared to z ∼ 0 is potentially in tension with the apparent lack of redshift evolution in the obscured luminosity (or SFR) fraction at a given stellar mass. Given that the bulk of dust mass is from the cold dust population, while the IR emission is dominated by the emission from warmer dust in the vicinity of actively star-forming regions, this finding hints at different dust-star geometry in z ∼ 2.3 main-sequence galaxies compared to that of the local galaxies (Section 5.4). Many of the existing empirical IR SED templates and gas mass calibrations at z > 1 are based on high mass, high metallicity galaxies. This study pushes into the important territory of lower metallicities and finds key differences in the dust properties of subsolar metallicity galaxies at z ∼ 2.3 compared to their more metal rich counterparts. These differences are crucial to be considered for future observations with IR facilities and motivate further detailed studies of gas and dust in the lower metallicity regime at high redshifts. For instance, better sampling of the FIR/submm emission of galaxies with ALMA and LMT/TolTEC, over a larger wavelength range, would help to set observational constraints on the cold dust temperature and dust emissivity index of subsolar metallicity galaxies, which are not well constrained by the single ALMA band observations presented in this work. Moreover, the relative contribution of PAHs and very small grains to the mid-IR spectra remains unconstrained with only a single photometric data point from Spitzer/MIPS 24 µm. Future studies with JWST/MIRI will be able to break this degeneracy owing to the multiband imaging and spectroscopy capabilities of MIRI. Additionally, future and ongoing CO observations of subsolar metallicity galaxies at z ∼ 2 are crucial to better constrain their molecular gas mass fractions and the redshift evolution of dust-to-gas mass ratios as a function of metallicity, providing key constraints for theoretical galaxy evolution models. We thank the referee for a very constructive review. We thank Luke Maud from the European ALMA Regional Center at the European Southern Observatory (ESO) for valuable discussions and his help with running the ALMA calibration and imaging pipeline. We thank Loreto Munoz and Ryan Loomis at the North American Alma Science Center for valuable input on image construction and flux measurements of the ALMA data. We also thank Tanio Díaz-Santos for providing the GOALS survey data catalogs. Support for this work was provided by NASA through the NASA Hubble Fellowship grant #HST-HF2-51420, awarded by the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., for NASA, under contract NAS5-26555, to IS. This paper makes use of the following ALMA data: ADS/-JAO.ALMA#2019. In this appendix we describe the IR templates that are used to fit the 24-to-1200 µm data by weighted least squares technique, where the weights are the inverse of the variances of the measurements. These templates are chosen to represent those commonly used in the literature. The Rieke et al. (2009, R09) SEDs are average templates for purely star-forming galaxies with L(IR) between 5 × 10 9 to 10 13 L . These empirical templates are constructed based on the IR emission of local star-forming galaxies, LIRGs and ULIRGs. On average, from lower to higher luminosity templates, the mid-IR aromatic emission becomes stronger and the far-IR peak wavelength (proportional to luminosity-weighted dust temperature) shifts to shorter values (higher temperatures). It has been previously shown that the rescaled R09 L(IR)= 10 11.00 − 10 11.75 L templates describe well the IR emission of ULIRGs at z ∼ 1 − 4 (e.g., Rujopakarn et al. 2013; De Rossi et al. 2018; Shivaei et al. 2020a) . Following the results of these studies, we use the R09 L(IR) = 10 11 − 10 12 L templates in this work. The two-temperature modified blackbody (2T-MBB) fitting of Kirkpatrick et al. (2015, K15) assumes two MBB components with different temperatures for the far-IR (> 20 µm) emission. We do not use these models to fit the mid-IR emission. This simplified model is useful for quantifying the relative temperature and strength of the warm and cold dust components. However, in case of limited data such as ours, a simplified 2T MBB approach would give us insight on the physical parameters governing the far-IR emission, i.e., the properties of the average cold and warm dust components. The 2T MBB has the form of S ν (ν) = a w ν β B ν (ν, T w ) + a c ν β B ν (ν, T c ), where β is emissivity index and B ν (ν) is the Planck function, T is the temperature in K, a is the normalization, and the w and c subscripts indicate the warm and cold components, respectively. We fix β and T c to simplify the model, as there is only a single ALMA point which is insufficient to constrain these two parameters. We set β to 1.5 as suggested by K15. There is evidence that the temperature of the cold dust component that determines the RJ emission tail is almost constant in different types of galaxies and redshifts. Rémy-Ruyer et al. (2015) calculated median dust temperatures of 25.9 K for the local low-metallicity galaxies of the Dwarf Galaxy Survey (DGS; Madden et al. 2013 ), while they found evidence for an additional hotter dust component with average temperature of 105 K for a fraction of the DGS galaxies. They found a median dust temperature of 25.7 K for the local higher metallicity sample of KINGFISH, without any additional warm component. K15 also showed a cold dust T = 25.8 − 28.1 K component in their star-forming dominated templates at z ∼ 1. Based on these findings, we set the cold dust temperature to T c = 25 K in our fits. As our goal in using the K15 2T MBB fitting procedure is to quantify differences in the shape of the IR SED between the low and high metallicity bins, and not to derive a template, adopting a different T c does not change our conclusions. For example, adopting a T c = 30 K increases the best-fit T w by ∼ 5 − 7 K, keeping the qualitative conclusions intact. A3. The L16 low-metallicity templates To complete the set of IR templates, we also use the IR SED library of 19 local low-metallicity galaxies from Lyu et al. (2016, L16) . The data are from the DGS survey . L16 selected 19 DGS galaxies that had high enough SNR in Spitzer/IRS spectra and constructed the templates by combining the IRS spectra with Herschel and WISE photometry ( Table 7 in L16). The main difference between their SEDs and those presented in Rémy-Ruyer et al. (2015) is that L16 adopted the powerlaw+MBB fitting procedure of C12, while Rémy-Ruyer et al. (2015) used the semi-empirical dust SED models of Galliano et al. (2011) . The advantage of the C12 model in this case is that, owing to the well-sampled mid-IR emission of these 19 galaxies, L16 were able to relax the bounding condition of the powerlaw turnover wavelength, λ c (see the discussion in Section 3.2). As a result, they were able to successfully capture the flux density excess at λ 50 µm with their fits, while Rémy-Ruyer et al. (2015) had to invoke a second MBB with a hotter temperature to fit the flux excess. To derive dust masses from RJ emission based on Equation 3, we need to assume a cold dust temperature, the opacity at a reference wavelength κ 0 , and a submm emissivity index β. Here, we discuss each of these assumptions in our calculations and their associated systematic uncertainties. Cold dust temperature -For the cold dust temperature, we perform a MC simulation by drawing dust temperatures from a Gaussian distribution with a mean of 25 K and σ = 5 K. This assumption takes into account the scatter in mass-weighted dust temperatures based on the results of previous studies with better submm wavelength coverage both at z ∼ 0 down to low metallicities and for massive galaxies at higher redshifts (Cortese et al. 2014; Kirkpatrick et al. 2015; Rémy-Ruyer et al. 2015; Scoville et al. 2016; Faisst et al. 2017) . These studies show that the mass-weighted temperature varies less than the luminosity-weighted dust temperature derived from shorter wavelength data. As the mass-weighted dust temperature is the relevant quantity to derive dust masses from RJ emission, we use T = 25 ± 5 K in Equation 3 and do not consider the hotter dust component that appears in Figure 5 . Dust mass inferred from the RJ emission is only linearly dependent on dust temperature (M dust ∝ 1/T ), and hence, the relative contribution of a possible warm component dominating the FIR luminosity would not be significant. For example, in the top-right panel of Figure 5 , in which the warm component has the most significant flux contribution to the RJ emission compared to the other bins, if we assume half of the 1.2 mm flux originates from the warm component with T = 54 K, the dust mass would be lower by only 35%, which does not change any of the main conclusions in this paper. Emissivity Index β -For β, we take the average of our best fits to the solar and subsolar metallicity stacks from the earlier sections in the paper, β = 1.5. An emissivity index of 1.5 − 2.0 has been previously observed in moderately lower metallicity local galaxies, for example in the outer disk of M33 (Tabatabaei et al. 2014) , in the Magellanic Clouds (Planck Collaboration et al. 2011) , in local galaxies (Cortese et al. 2014; Faisst et al. 2017) , and in a few higher redshift galaxies (Faisst et al. 2020) . Our fits to the subsolar-metallicity stacks strongly prefer an even lower β = 1. The emissivity index derived from SED fitting is in fact an effective emissivity index, which can be different from the intrinsic emissivity of the cold grains (Dunne et al. 2000; Galliano et al. 2011) . If there is a distribution of temperatures (which is realistically always the case for integrated observations of high-redshift galaxies), the SED will be broadened and the effective β will be lower. In our case of subsolar metallicity observations, because of the presence of a significant warm component, if we use the flexible models of Casey (2012) , the best-fit model with the least χ 2 is the one with lowest allowed β, as it provides the widest possible SED to account for the excess emission at ∼ 30 µm. The best-fit β value in the Casey (2012) models also depends on the pivot wavelength assumption (λ c ), which determines the transition point between the MBB and power-law components. λ c is fixed in our fits based on the suggested value in Casey (2012) that is derived from a sample of GOALS galaxies. Therefore, the validity of this assumption for our galaxies is in question, which in turn affects the best-fit β. These degeneracies can only be diminished by a better sampling of the IR SED. In short, the best-fit emissivity index of 1 in our subsolar-metallicity fits is a lower limit on the intrinsic emissivity, which is likely lower than that of the solar-metallicity galaxies. Although it is common practice to assume the Galactic emissivity index of β = 2, it is shown in the resolved studies of LMC that the intrinsic β can be lower than 2 . Opacity Coefficient κ0 -The absorption cross-section per dust mass or dust mass opacity coefficient is the main source of uncertainty in dust mass estimates from FIR/submm wavelengths. The absorption cross-section is a function of wavelength in the form of κ = κ 0 ( λ λ0 ) −β , where κ 0 is the normalization at a reference wavelength λ 0 . The reference normalization value is notoriously uncertain. It depends on the grain composition, size distribution, and many other factors that make the results of models that are even based on the same dataset significantly different from each other. As a result, dust masses derived from different models can vary by up to a factor of ∼ 3. In addition to the β dependence in the functional form of κ, κ 0 itself depends on β Bianchi 2013) . Grains with smaller emissivity index have larger submm opacity. Therefore, the assumption of β in the dust model (or the MBB functional shape) should be consistent with the assumption of β that the κ 0 is based on. In fact, Bianchi (2013) shows that if β is kept consistent, dust masses are not significantly dependent on the assumption of β. Another source of uncertainty in κ 0 is the possible variations in the ISM and dust emission properties of the systems under study compared to those in the local MW, on which the κ 0 calibrations are often based (Dwek et al. 1997; Li & Draine 2001; Draine 2003; Zubko et al. 2004) . In these studies, the absorption cross-section per H atom is derived from the emission properties of the MW cirrus, and then converted to an absorption cross-section per dust mass by adopting a MW gas-to-dust ratio. Bianchi et al. (2019) uses a different method (also used by Clark et al. 2016) to derive dust masses from integrated measurements of gas masses, metallicities, and FIR observations for a local sample from the DustPedia project. The study shows a large range of κ 0 values that vary galaxy-to-galaxy, and are mildly dependent on metallicity. Their results are conditional on the assumptions made for the fraction of metals in dust, the dust heating conditions, and dust-to-gas mass relation with metallicity. We adopt the κ 0 = 0.4 m 2 kg −1 value of Draine (2003) at 250 µm for β = 2.08 (Bianchi 2013) and use the conversion factor of 0.58 from Bianchi (2013) to calibrate it for β = 1.5. That is, we assume a κ 0 = 0.232 m 2 kg −1 as the opacity at 250 µm for an emissivity index of 1.5. A β = 2 assumption would decrease the estimated dust masses by ∼ 30%. The HRS dust masses and the dust masses of Santini et al. (2014) are also based on the Draine (2003) It is previously claimed that dust masses derived from a single MBB tend to be lower than those derived from IR SED fitting (e.g., see Rémy-Ruyer et al. 2015) . To assess the degree of such bias in this work, we estimate the dust masses of the z ∼ 0 DGS, KINGFISH, and HRS galaxies from their observed SPIRE 350 µm fluxes using Equation 3, and compare them with the estimated dust masses of the local samples derived from the IR SED fitting in Rémy-Ruyer et al. (2015) and Cortese et al. (2014) . Following the previous discussion, we make the same assumptions as done for the z ∼ 2.3 sample: a T = 25 ± 5 K and two different β (and κ0) assumptions of β = 1.5 (κ0(250 µm) = 0.232 m 2 kg −1 ) and β = 2.08 (κ0(250 µm) = 0.4 m 2 kg −1 ). The assumption of β = 1.5 (β = 2.08) result in dust masses that are on average 0.15 (0) Earth, Planets, and Space Astronomical Society of the Pacific Conference Series ALMA Technical Handbook