Shen-Orr et al. reply:

We appreciate the comments made by Zhong and Liu and their hard work on the proof (ref. 1). Indeed, removing unneeded normalization steps, including log transformation, can yield even better linearity and thereby improve the performance of deconvolution methods.
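The linearity point can be illustrated with a minimal numerical sketch. The cell-type expression values and proportions below are made-up illustrative numbers, not data from the publication, and the helper name `mix` is hypothetical. The sketch shows only that a mixture is a linear combination on the raw scale, whereas the log of a mixture is not the same linear combination of the logs.

```python
import math

def mix(proportions, cell_type_expression):
    """Weighted sum of cell type-specific expression (the linear mixing model)."""
    return sum(p * x for p, x in zip(proportions, cell_type_expression))

# Hypothetical values: two cell types with raw-scale expression 100 and 1000,
# present in a sample at proportions 0.3 and 0.7.
cell_type_expression = [100.0, 1000.0]
proportions = [0.3, 0.7]

# On the raw scale the measured mixture is exactly the weighted sum.
raw_mixture = mix(proportions, cell_type_expression)

# On the log scale the two quantities below differ, so a linear model fitted
# to log-transformed mixtures is no longer an exact description of the mixing.
log_of_mixture = math.log2(raw_mixture)
mixture_of_logs = mix(proportions, [math.log2(x) for x in cell_type_expression])

print(raw_mixture, log_of_mixture, mixture_of_logs)
```

Because the logarithm is concave, the log of the mixture is never smaller than the mixture of the logs, and the gap grows as the cell types differ more in expression.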

Although we would expect better deconvolution methodology to be more sensitive in detecting cell type–specific differences between groups, we have found empirically that this is not always the case. Cell type–specific significance analysis of microarrays (csSAM) compares expression between two groups (ref. 2); the gene expression data of each group are deconvolved separately to yield cell type–specific expression. The false discovery rate for cell type–specific differences between groups is assessed via permutations, an expected side effect of which is a reduced effect of systemic biases, as those are controlled for statistically. We found that in the complex context of actual sample data, applying a log transformation to the measured 'raw' gene expression data input into the linear csSAM deconvolution model often yields improved (lower) false discovery rates between groups compared with keeping the raw data as is or log-transforming after deconvolution. Such is the case for the acute-rejection versus stable individual data we discuss in the publication (ref. 2; Supplementary Fig. 1). A possible reason is that, depending on the technology used, actual transcript abundance may be separated from what we consider 'raw' measured gene expression by intermediate steps (for example, labeling, hybridization and scanning in the case of microarrays), which may affect linearity. Thus, we recommend that users of csSAM try different choices of transformations, guided by the visual appearance of the results and the estimated false discovery rate.
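The procedure described above (separate deconvolution per group, then a permutation-based false discovery rate) can be sketched as follows. This is a simplified illustration under stated assumptions, not the csSAM package code: it handles only two cell types, uses a plain difference in estimated expression rather than the package's test statistic, and all function names, parameters and the synthetic data are hypothetical. The `log_input` flag switches between deconvolving raw and log-transformed input so that the two choices can be compared on the same data.

```python
import math
import random

def deconvolve(values, proportions):
    """Least-squares fit of values[i] ~ p[i][0]*a + p[i][1]*b; returns (a, b)."""
    s11 = sum(p[0] * p[0] for p in proportions)
    s12 = sum(p[0] * p[1] for p in proportions)
    s22 = sum(p[1] * p[1] for p in proportions)
    t1 = sum(p[0] * y for p, y in zip(proportions, values))
    t2 = sum(p[1] * y for p, y in zip(proportions, values))
    det = s11 * s22 - s12 * s12  # nonzero when proportions vary across samples
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

def group_differences(genes, proportions, labels, log_input):
    """Per gene: estimated cell type 0 expression in group 1 minus group 0."""
    diffs = []
    for values in genes:
        if log_input:
            values = [math.log2(v) for v in values]  # transform before deconvolution
        estimates = []
        for group in (0, 1):
            idx = [i for i, lab in enumerate(labels) if lab == group]
            estimates.append(deconvolve([values[i] for i in idx],
                                        [proportions[i] for i in idx]))
        diffs.append(estimates[1][0] - estimates[0][0])
    return diffs

def permutation_fdr(genes, proportions, labels, n_called, log_input,
                    n_perm=200, seed=0):
    """Estimated FDR when calling the n_called genes with the largest |difference|."""
    rng = random.Random(seed)
    observed = [abs(d) for d in group_differences(genes, proportions, labels, log_input)]
    threshold = sorted(observed)[-n_called]
    called = sum(d >= threshold for d in observed)
    null_calls = 0
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels, keep group sizes
        null = group_differences(genes, proportions, shuffled, log_input)
        null_calls += sum(abs(d) >= threshold for d in null)
    return min(1.0, (null_calls / n_perm) / called)

# Synthetic example: 12 samples, 20 genes, the first 3 genes differ in cell type 0.
data_rng = random.Random(1)
proportions = []
for _ in range(12):
    p = data_rng.uniform(0.2, 0.8)
    proportions.append((p, 1.0 - p))
labels = [0] * 6 + [1] * 6
genes = []
for g in range(20):
    a_group1 = 300.0 if g < 3 else 100.0
    genes.append([(a_group1 if lab else 100.0) * p[0] + 200.0 * p[1]
                  + data_rng.gauss(0.0, 5.0)
                  for p, lab in zip(proportions, labels)])

fdr_raw = permutation_fdr(genes, proportions, labels, n_called=3, log_input=False)
fdr_log = permutation_fdr(genes, proportions, labels, n_called=3, log_input=True)
print("estimated FDR, raw input:", fdr_raw)
print("estimated FDR, log2 input:", fdr_log)
```

Fixing the number of genes called, rather than a threshold on the difference itself, keeps the two estimates comparable even though raw and log-transformed differences are on different scales. Which transformation gives the lower estimate depends on the data; this toy simulation is exactly linear on the raw scale by construction and so does not reproduce the behavior reported for real samples.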

The latest update of the csSAM R package and a Microsoft Excel Add-In are available at http://buttelab.stanford.edu/public:data. Both include added functionality that allows effortless switching between log-transformed and anti-log-transformed gene expression values when performing either the deconvolution or the comparative expression step of csSAM.