Abstract

We introduce CIBERSORT, a method for characterizing cell composition of complex tissues from their gene expression profiles. When applied to enumeration of hematopoietic subsets in RNA mixtures from fresh, frozen and fixed tissues, including solid tumors, CIBERSORT outperformed other methods with respect to noise, unknown mixture content and closely related cell types. CIBERSORT should enable large-scale analysis of RNA mixtures for cellular biomarkers and therapeutic targets (http://cibersort.stanford.edu/).

Accessions

Primary accessions

Gene Expression Omnibus

Acknowledgements

We are grateful to H. Maecker, M. Davis, R. Levy and the Stanford Human Immune Monitoring Center for assistance with this study. This work was supported by grants from the Doris Duke Charitable Foundation (A.A.A.), the Damon Runyon Cancer Research Foundation (A.A.A.), the B&J Cardan Oncology Research Fund (A.A.A.), the Ludwig Institute for Cancer Research (A.A.A. and M.D.), US National Institutes of Health (NIH) grant U01 CA154969 (A.J.G., W.F., Y.X., C.D.H. and M.D.), NIH grant U19 AI090019, NIH grant PHS NRSA 5T32 CA09302-35 (A.M.N.), US Department of Defense grant W81XWH-12-1-0498 (A.M.N.) and a grant from the Siebel Stem Cell Institute and the Thomas and Stacey Siebel Foundation (A.M.N.).

Author information

Author notes

    • Michael R Green

    Present address: Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, Nebraska, USA.

    • Aaron M Newman & Chih Long Liu

    These authors contributed equally to this work.

Affiliations

  1. Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, California, USA.

    • Aaron M Newman, Chih Long Liu, Maximilian Diehn & Ash A Alizadeh

  2. Department of Medicine, Division of Oncology, Stanford Cancer Institute, Stanford University, Stanford, California, USA.

    • Aaron M Newman, Chih Long Liu, Michael R Green & Ash A Alizadeh

  3. Center for Cancer Systems Biology, Stanford University, Stanford, California, USA.

    • Michael R Green, Andrew J Gentles & Ash A Alizadeh

  4. Department of Radiology, Stanford University, Stanford, California, USA.

    • Andrew J Gentles

  5. Department of Radiation Oncology, Stanford University, Stanford, California, USA.

    • Weiguo Feng & Maximilian Diehn

  6. Department of Cardiothoracic Surgery, Division of Thoracic Surgery, Stanford University, Stanford, California, USA.

    • Yue Xu & Chuong D Hoang

  7. Stanford Cancer Institute, Stanford University, Stanford, California, USA.

    • Maximilian Diehn & Ash A Alizadeh

  8. Department of Medicine, Division of Hematology, Stanford Cancer Institute, Stanford University, Stanford, California, USA.

    • Ash A Alizadeh

Authors

  1. Aaron M Newman
  2. Chih Long Liu
  3. Michael R Green
  4. Andrew J Gentles
  5. Weiguo Feng
  6. Yue Xu
  7. Chuong D Hoang
  8. Maximilian Diehn
  9. Ash A Alizadeh

Contributions

A.M.N. and A.A.A. conceived of CIBERSORT, developed strategies for related experiments, analyzed the data and wrote the paper. A.M.N. developed and implemented CIBERSORT. C.L.L. implemented web infrastructure and wrote the paper. M.R.G. performed flow cytometry and gene expression profiling of leukocytes from human tonsils and peripheral blood. A.J.G. assisted in the conceptual development of CIBERSORT. W.F., Y.X., C.D.H. and M.D. assisted in the collection and analysis of lung tissue. All authors discussed the results and implications and commented on the manuscript at all stages.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Ash A Alizadeh.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–14, Supplementary Note, Supplementary Results and Supplementary Discussion

Excel files

  1. Supplementary Table 1

    Leukocyte signature matrix (LM22). Details of LM22, including gene expression matrix and source data.

  2. Supplementary Table 2

    Validation of LM22 on external datasets of purified leukocyte subsets. Analysis of external GEP datasets consisting of distinct leukocyte subsets.

  3. Supplementary Table 3

    Feature comparison of GEP deconvolution methods analyzed in this work. Table comparing key features of GEP deconvolution approaches.

  4. Supplementary Table 4

    Comparative analysis of GEP deconvolution methods. Performance comparison of CIBERSORT, RLR, PERT, LLSR, and QP on both complex and idealized mixtures.

About this article

DOI

https://doi.org/10.1038/nmeth.3337
