1) Since I was a wee astronomer I was taught that not using "relatively" is a virtue, as it is a word with hardly any information content. Your paper is far from virtuous according to that metric. I counted 9 instances, and may have missed some :-) I think it can be removed from all those sentences with no loss of information, to appease the ghosts of Strunk & White.
*** N(relatively)=10! I chopped out seven of them.

I also have a rather self-serving suggestion for a much neglected reference to an observation of a submm-mm excess in a low metallicity galaxy, with a reasonably serious attempt at understanding the cause, that predates the list you now have in the intro: Bolatto et al. 2000, ApJ, 532, 909.
*** Thanks. It's now included.

2) S3.2, "SPIRE observations for six of our galaxies were observed..." Maybe "obtained"?
*** Fixed.

3) S3.4, apertures are "chosen by eye" to encompass all emission, but then there is an aperture correction (i.e., there is emission that is missed). That seems like a contradiction. Perhaps a little rephrasing would help.

4) Is the identification of background sources done independently for each band, or jointly? At 500 um one may be more likely to miss background galaxies because of the poor resolution, and that may be important for those dwarfs you point out have significant contamination. It may also be important for quantifying the 500 um excess properly (more on that later).
*** Jointly.

5) I understand why you use it, but the 3.6 um seems like a really bad template for the ISM. You give the caveat, and since the correction is small that is probably OK.

6) S4.1, reddening curve of Li & Draine (2001). Maybe it'd be good to mention out to what wavelength you corrected for Galactic extinction. Also, do PACS and SPIRE use the IRAS system or the MIPS system? (You say no color corrections are applied; I just want to know how large they are likely to be.)
*** I added a footnote quantifying the reddening corrections.
The MIPS calibration assumes f_nu \propto nu^2, whereas the PACS and SPIRE calibrations assume f_nu \propto nu^{-1}. IRAS assumes f_nu \propto nu^{-1}. I accounted for these differences in the comparison.

7) I like the nice concise explanation of the Draine model. Note however an inconsistency: on pg. 12 we say that dust with U>Umin is the PDR component, but in the Summary (before eq. 9) we say that U>100 for fPDR.
*** Yeah, Wolfire pointed that out as well. I was following the definitions laid out in Draine & Li 2007 and Draine et al. 2007, but as you point out, there is an inconsistency. I've removed "PDR" from the discussion surrounding Equation 9, but I still want to compute the U>100 fraction for consistency with the results published in Draine's SINGS paper.

8) S4.4: In discussing Fig. 4, you seem to miss the most apparent point, which is that there are clear systematic trends with 70/160 "temperature" and the largest systematic differences occur for "cooler" galaxies (maybe that isn't true, but certainly the scatter is much less for cooler galaxies).
*** Hmmm ... There is no trend for gamma, and I'm not sure about the others. But I agree that the dispersion is smaller for cooler galaxies. Text modified accordingly (in a couple of spots).

9) In the second paragraph we say that masses for metal-rich galaxies get smaller with Herschel. Then we say that Galametz finds that metal-rich galaxies have the SED peak beyond 160 um (i.e., they are cool), so Herschel is necessary. To me this is a bit of a puzzle: it seems to suggest that without Herschel one tends to think the dust is on average colder than it really is, and so assigns the galaxies bigger masses. I don't know why the system would behave that way. But one thing is clear: looking at Figs. 4 and 5 it is transparent that we are exchanging Umin for mass (the plots are essentially mirrored).
So what the inclusion of Herschel is doing is driving Umin to larger values for metal-rich galaxies, which means that we need less dust mass. It is also driving Umin to smaller values for low metallicity galaxies, which increases their dust mass. And I imagine that is all due to the delta(U-Umin) in Eq. 4, which encompasses most of the dust in the system (Gamma changes a bit in the fits, but probably not enough to compensate). I'd have expected Umin to be mostly set by the peak of the SED, but it seems that it's really set by the RJ tail that comes from Herschel. I haven't played with the fits, so I don't really know how orthogonal Umin is to the rest of the free parameters, but I wouldn't have necessarily predicted this behavior :-)
*** The mirroring of U_min and M_dust is based on the fact that I compute M_dust using Equations 33 & 34 of Draine & Li 2007. I'll need to check with Bruce and see if he thinks it should be calculated differently.

10) S4.5. People don't necessarily use modified BB fits because they are "quick and simple". They use them because they don't have the entire SED that is needed to fit a Draine model!

11) For the BB fits, an important point is how the SED points are weighted in the fit.
*** They are weighted by their uncertainties. I added that nugget to the figure caption.

12) Isn't it surprising that using the wrong beta (beta=2 instead of beta=1.6) gives a better result (only 20% off, instead of 70%)?
*** The temperature drops a bit in the beta=2.0 and 70