** We thank the referee for the careful review and for the constructive suggestions. Our responses are included below.

S3.1 P5 - The authors state that a new sky estimation and editing of foreground and background sources was used. What are they? Is this what you describe in §3.4?

** Yes, the procedures are described in Section 3.4. We have added a reference to Section 3.4 in the description in Section 3.1.

S3.2 P1 - The authors state that deeper observations of Holmberg II allowed for the measurement of diffuse emission. Does this mean that the aperture for HolmII was increased, thus allowing for the capture of additional flux? If so, aren't you using the same aperture for all bands? If not, that statement doesn't make a whole lot of sense. The source has some true flux, where low surface brightness flux is hidden by noise. The true flux is always present, and the noise decreases with a deeper observation, revealing it. So, a better wording would be "deeper observations have allowed for a more robust measurement, and in this case one that was N-sigma brighter than the estimate in Dale et al. (2010)", or something to that effect.

** We agree and have improved the description of the impact of the deeper PACS imaging.

S3.4 P1 - "emission ... is removed before the sky is estimated". How is it removed? Do you fit a profile, mask to some isophote, etc.? Or did you use the same procedure as in the comment below?

** The referee has pointed out a redundancy in the two paragraphs; yes, we used the same procedure referred to in the comment below. For clarity we have removed the first description of "sky cleaning" and restructured the subsection.

S3.4 P2 - "removal is accomplished via ... the sky annulus". Performing an annular sky estimate such as this will return an RMS estimate that is systematically lower than the one you measure using your sky apertures, because the annular estimate is agnostic to correlations in your sky.
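To illustrate this point, here is a toy numpy/scipy sketch (synthetic sky only; the smoothing-kernel and aperture sizes are hypothetical and not drawn from the paper's data) showing how a per-pixel RMS underpredicts the uncertainty of an aperture sum when sky pixels are correlated:

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(42)

# Synthetic correlated "sky": white noise smoothed over a 4x4 pixel box
# (hypothetical correlation length).
sky = uniform_filter(rng.normal(0.0, 1.0, size=(1024, 1024)), size=4)

# Per-pixel RMS, as one would measure from a sky annulus.
pixel_rms = sky.std()

# Empirical uncertainty of an aperture sum: tile the frame into
# non-overlapping square apertures of side `a` and take the std of the sums.
a = 16
n = 1024 // a
sums = sky.reshape(n, a, n, a).swapaxes(1, 2).reshape(n * n, a * a).sum(axis=1)
empirical_sigma = sums.std()

# Naive prediction assuming uncorrelated pixels: sqrt(N_pix) * pixel_rms.
naive_sigma = pixel_rms * a

print(empirical_sigma / naive_sigma)  # well above 1 for correlated sky
```

For pure white noise the ratio would be close to unity; the smoothing drives it well above 1, which is exactly the underestimate described above.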
Furthermore, random draws from a normal distribution with sigma = annular sky RMS will also fail to capture the impact of correlated sky. Do you have an idea of how correlated your sky pixels are per band? You could check this by simply comparing the sky RMS measured in your annulus to the sky RMS measured in your apertures. Finally, this process assumes that there is no source flux intersecting with your contaminant flux. So, if the contaminants are small compared to your targets and/or are exclusively in the very low surface brightness parts of your apertures, then this procedure should be fine. Is this the case? If not, how much does this segmented deblending impact your flux estimates?

** The referee raises some good points. Fortunately, the contaminants are always much smaller than our target galaxies, and the vast majority are projected to lie in the galaxies' extended low surface brightness areas; thus the potential impact of correlated sky pixels is limited. We have added a note on this to the text.

S3.5 P2 - "No aperture corrections were applied to the WISE photometry as they are negligible for the large apertures used here." Is this true? The aperture corrections that you implement for Spitzer (assuming that they are similar to those in Dale et al. 2007) are roughly in the range 1-10%. The WISE PSFs have wings that extend to many tens of arcsec, and a cursory check of the PSFs (in the WISE All-Sky Release Explanatory Supplement) suggests that a circular aperture of radius 40" (i.e., M81dwA, your smallest source) will have a W1 aperture correction of ~4% and a W4 aperture correction of ~8%. If those numbers are correct, and you are including Spitzer aperture corrections of the same order, should you really wave off the correction in WISE?

** We enlisted the help of WISE gurus (and co-authors) Tom Jarrett and Michael Brown when grappling with the WISE calibrations and aperture corrections.
The WISE documentation at http://wise2.ipac.caltech.edu/docs/release/allsky/expsup/sec4_4c.html#apcor indicates aperture corrections of 3-4% for large-aperture photometry, consistent with what is reported in Section 3.5 of Jarrett et al. (2013; http://adsabs.harvard.edu/abs/2013AJ....145....6J). However, the WISE aperture corrections are even smaller than 3-4% since we use native-resolution imaging (and not the Atlas release images). Given these caveats we prefer not to invoke aperture corrections, to avoid potential over-correction. We now mention this caveat explicitly in the text.

S3.5 P-2 - Upper limits. It is this referee's opinion that the choice of providing upper limits is a poor one. In the case where you have chosen explicitly to use a single aperture for every band, what benefit is there in choosing to provide upper limits over forced measurements? I.e., simply specify the same "value \pm uncertainty" for every target that you have data for, regardless of whether it has a >1 sigma measurement or not. The forced measurement encodes substantially more information, makes your dataset much more useful to other parties, and (if the end user so desires) can be converted into an upper limit anyway post facto. Choosing to discard the flux information encoded in your forced measurement, which you've taken the time to measure, just because it has a 0.999 sigma significance (for example) seems wasteful. If you do still want to specify an upper limit for your sources, shouldn't this be based on the total uncertainty, rather than the sky uncertainty alone?

** We have added a new table that provides aperture photometry for cases where we list upper limits in the main flux table. One benefit of utilizing upper limits in the main flux table is that it reflects the authors' best assessments of whether detections are achieved, assessments that leverage information at other wavelengths.
For example, for faint sources that are apparently undetected at the longest wavelengths, sky fluctuations that are not spatially aligned with the emission at shorter wavelengths can yield a false "detection" that is nevertheless nonsensical.

S4.1 P1 L2 - Why do you opt to provide the fluxes in this paper without correction for Galactic extinction? Presumably all of the figures that you show (i.e., colours and SEDs) use the corrected fluxes, so why not provide the corrected fluxes in the tables also? I notice that you chose to correct for extinction in Dale et al. (2012); is there a particular reason that you chose to change tack?

** We have encountered instances where users from the community have incorrectly undone our extinction corrections. Thus we have chosen to provide fluxes not corrected for MW extinction, to minimize confusion.

S4.1 P2 L5 - "a second grouping of B data points". It seems as though about half of the Dale07 B data points are in this group! Is it fair that this calibration error would affect such a large portion of the sample? Regardless, it's probably worth explicitly stating the fraction of Dale07 B-band fluxes that are off by such a large, constant fraction. If nothing else this will further motivate the community to ensure they use your new, better fluxes.

** We now state this fraction in the text.

S4.2 P-1 L-5 - You specify that you are using the same DL07 dust opacities in the Figure 5 comparison. Are the SEDs you show using the updated opacities? Or do they too use the old ones? This is probably worth stating explicitly.

** We have restructured that sentence to more clearly indicate that we are still using the models from Draine & Li (2007). [The updated models are not yet available.]

S4.3 - The authors attempt to demonstrate a measured sub-mm excess at 500um, albeit at low significance.
There are a few things that I think require thought:

E6: the parametrisation of the excess
P2L8: the exploration of an excess at 500um only
P2L9: the use of `secure 500um detections'

E6: The excess parametrisation: Parametrising the excess in terms of the modelled flux seems like a strange choice. Yes, it provides an indication of whether or not the observed excesses are a constant fraction of the expected/modelled flux, but it doesn't provide any indication of the uncertainty on each of your measured excesses. Why not also describe the excess as a sigma with respect to the model and observational uncertainties? This value is then a direct representation of the tension between the excess and the fit, which I think is a more useful quantity when determining the robustness of your excess. Having both of these values seems more reasonable to me, and I think the latter more naively interpretable.

** We now include a measure of the excess normalized by the observational uncertainty.

P2L8: The excess at 500um only: You argue that this excess is in line with excesses found in other works, e.g., in the analysis of M33 by Hermelo et al. (2016). While Hermelo demonstrated that the excess persists in all bands in the range 500um < lambda < 3mm, the analysis here is restricted to the 500um band. Admittedly this is likely due to low number statistics in the 850um band, but given that the very cold dust component would be more prominent at 850um than at 500um, it seems remiss not to check whether the excess is seen there also. If so, great! If not, why? This is particularly relevant given that, visually, 17/27 of the SCUBA data points fall below your extrapolated dust model. If the excess is not present at 850um, does this mean that the excess is indicative of a problem with the 500um fluxes? Or is there a systematic bias in the 850um fluxes?

** The referee raises an excellent point.
We have not incorporated the SCUBA 850um data points in any SED fitting or submm excess analysis since those data have limitations. First, some of the SCUBA maps are small enough to require estimated aperture corrections of up to a factor of 2.2 (Dale et al. 2005). Second, the SCUBA observations were carried out in either jiggle-chop map mode or scan map mode; it is difficult to detect emission beyond 2'-3' in jiggle-chop mode, and large-scale structure is still missing even for scan mapping observations (e.g., Johnston et al. 2000; Stevens et al. 2005).

** To help address these concerns we have added lower-resolution Planck 850um fluxes to our tables and SED figures. We have added a subsection that introduces these data and in which we caution that the lower resolution of the Planck 850um data may lead to contamination by background/foreground sources.

P2L9: `Secure 500um detections': This is undoubtedly a very small effect given your sample, but nonetheless: the choice of using sources that are explicitly strongly detected in this analysis introduces a Malmquist bias that increases your likelihood of measuring an excess, as the noise in the 500um band is higher than that in the adjacent 250um and 350um bands that constrain most of the cold component of your fit. It seems like a waste not to use all of your measurements, with relevant uncertainties, and avoid this bias entirely. You can do this if you use the forced measurements instead of the limits. I expect that adding in these low-sigma data will not change your results, and including them increases the robustness of your (weighted? It should be weighted.) mean excess.

** We have added a note of caution that our results may be biased due to our restriction to securely detected sources.

---------------------------------------------------------------------

Comments on Figures:

Figure 1: Showing the uncertainties on your data points might be nice given that they span the colour-magnitude space well.
This would give a nice indication of the confidence of the sources that are fairly extreme outliers to the SDSS distribution.

** Done.

Figure 2: The addition of a characteristic uncertainty in each panel is needed. Otherwise it's impossible to tell what is a significant outlier or otherwise. An additional axis value on the left margin would also be helpful, rather than just "1".

** We have added error bars as well as additional y-axis enumerations.

Figure 3: Including the additional magnitude axis as in Figure 2 would be very useful. As with Fig. 2, another axis label on the left margin would also be helpful. It is worth including in the caption that the constancy of the error bars as a function of flux is because your fluxes are dominated by the absolute, relative, and extended-source calibration uncertainties.

** We have added additional enumerations to the left-hand axes but prefer to omit magnitudes along the right-hand axes since that is an uncommon unit for the far-infrared/submillimeter. We have added a note to the caption about the relative constancy of the error bars.

Figure 4: Plotting SEDs is always tricky, because you can hide a lot of evil in 6 dex of dynamic range. As such, it's common practice to include a residual panel along with each SED. I appreciate that in your case this would likely become cluttered very fast, but given that you have ~1/3 of a page of white space per Figure 4 panel I think it can, and probably should, be done. However, I will leave this up to the authors. At the least you should include some indication of the uncertainties. If they are smaller than the data points, then say so.

** We now mention in the caption that the uncertainties are smaller than the data points.

You have a few SEDs where there appears to be no fit. Are there fits to these data that are just very unconstrained? Or do you not have fits because you're using upper limits?
Either way it's probably worth adding a note to the caption stating whether or not you have dust estimates for these sources (even if it is just a limit). Incidentally, (again) if you use forced photometry then you should be able to get a perfectly reasonable estimate and uncertainty, rather than nothing at all.

** We have added a note to the caption that explains there are no fits in cases with FIR non-detections. As explained above, we have taken the referee's suggestion of including a table with forced photometry for cases of non-detections. However, we prefer not to fit these data, especially since many of the forced photometric data points are negative.

What's going on with NGC3034? Are the Spitzer data saturated? And if so, why isn't this indicated in the table data? And why is there no fit?

** Good catch. We have added a note to the table that the Spitzer MIPS data are unreliable due to the effects of saturation. We now include a dust SED fit that excludes the MIPS data.

In Dale et al. (2012) you mentioned that the SED fits to NGC0584, DDO053, and M81dwB were poorly constrained because of large quantities of MW dust along the line of sight. Is that still the case here? Or are you confident that your updated methods (sky estimate etc.) have circumvented the problem?

** We essentially have the same concerns for the far-infrared photometry for these targets, though the addition of WISE data gives us incrementally more information for the SED fits. We have added a short discussion of this aspect in Section 4.2.

In your figures the Spitzer arrows should be red so that they stand out from the Herschel arrows.

** Agreed. Done.

Figure 5: Your figure and caption do not match. What is sigma? Is it the residual RMS? There is a lot of white space in this figure too. In general, I find that "value vs value" plots are typically quite wasteful, because any interesting trends are lost in the large dynamic range.
You'd be better served by having "sigma residual vs value" figures instead, especially if you're trying to demonstrate subtle differences in the datasets. By sigma residual I mean (a-b) / uncertainty on (a-b).

** We have reconfigured the plot and its caption. The figure now plots the difference in the values vs. the values from this work.

Figure 6: Your characteristic oxygen abundances should include an uncertainty propagated from the uncertainty in the Moustakas et al. (2010) fits. Or you should justify why you do not need to show them. Ideally you would also show the uncertainties in both dimensions in both panels, but I appreciate that this would get cramped.

** We have added error bars to the metallicities.

Figure 7: Is the sigma you quote the population RMS? If so, I think that the uncertainty on your mean would be a more useful quantity, as it will showcase any tension between the expected ratio and that observed.

** We have replaced the population dispersion with the uncertainty on the mean, and indicated so in the caption.

---------------------------------------------------------------------

Typing errors and other details:

S3.2 P2 L1 - "Level 0 to Level 1" and "Level 1-to-Level 2"; choose one format.

** Done.

S3.2 P3 L3,L4 - Using an actual square symbol for square-arcsec units seems very strange. Is that the appropriate ApJ style? I think simply arcsec$^2$ would be much nicer.

** Changed.

S3.2 P4 L2 - Your \approx should be \sim, given that it's a combination of two previous \sim values.

** Changed.

---------------------------------------------------------------------
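As an aside, the "sigma residual" suggested for Figure 5 and an inverse-variance-weighted mean (as suggested for the excess analysis) could be computed along these lines; a minimal numpy sketch with purely hypothetical flux arrays:

```python
import numpy as np

# Hypothetical measured (a) and comparison (b) fluxes with uncertainties;
# values are illustrative only, not from the paper.
a, sig_a = np.array([1.2, 0.8, 2.0]), np.array([0.1, 0.1, 0.2])
b, sig_b = np.array([1.0, 0.9, 1.7]), np.array([0.1, 0.1, 0.2])

# Sigma residual: (a - b) normalized by the uncertainty on (a - b),
# assuming independent errors.
sigma_resid = (a - b) / np.hypot(sig_a, sig_b)

# Inverse-variance-weighted mean of the residuals, and its uncertainty.
w = 1.0 / (sig_a**2 + sig_b**2)
wmean = np.sum(w * (a - b)) / np.sum(w)
wmean_err = 1.0 / np.sqrt(np.sum(w))
```

The weighted mean and its error directly expose any tension between the two datasets, in the spirit of the Figure 7 comment above.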