BIOINFORMATICS

Contrasting PCA across datasets

Abid, A. et al. Nat. Commun. 9, 2134 (2018).

Principal component analysis (PCA) is a popular method for transforming high-dimensional data into a smaller set of orthogonal variables or components that capture most of the variation in the full dataset. PCA is often applied to multiple datasets, and the resulting two- or three-dimensional plots are visually compared to assess differences. Abid et al. have now developed contrastive PCA (cPCA), an unsupervised method that provides systematic rather than subjective comparisons. cPCA finds subspaces that capture a large amount of variation in one dataset but little variation in a background dataset used for comparison. With careful selection of control data, the approach allows researchers to look for the enriched components with the greatest biological relevance. The authors demonstrate cPCA as a contrastive tool to discriminate patterns in protein expression, single-cell RNA-seq and genetic polymorphism data.
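The contrast described above can be sketched numerically. This minimal example assumes the formulation in which contrastive directions are the leading eigenvectors of the target covariance minus a weighted background covariance. The weight `alpha`, the helper name `contrastive_pca` and the synthetic data are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np


def contrastive_pca(target, background, alpha=1.0, n_components=2):
    """Hypothetical helper: leading eigenvectors of cov(target) - alpha * cov(background)."""
    # Center each dataset on its own feature means.
    target_c = target - target.mean(axis=0)
    background_c = background - background.mean(axis=0)

    # Feature-by-feature covariance matrices (rows are samples).
    cov_target = np.cov(target_c, rowvar=False)
    cov_background = np.cov(background_c, rowvar=False)

    # Directions with high target variance and low background variance
    # have large eigenvalues in this contrast matrix (assumed formulation).
    contrast = cov_target - alpha * cov_background
    eigvals, eigvecs = np.linalg.eigh(contrast)  # eigenvalues in ascending order

    # Keep the n_components directions with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    directions = eigvecs[:, order]
    return target_c @ directions, directions


# Synthetic illustration (assumed data): feature 0 carries large nuisance
# variance shared by both datasets; feature 1 carries a two-group signal
# that is present only in the target.
rng = np.random.default_rng(0)
n_samples = 400
scales = np.array([10.0, 1.0, 1.0])

background = rng.normal(size=(n_samples, 3)) * scales
labels = np.repeat([0, 1], n_samples // 2)
target = rng.normal(size=(n_samples, 3)) * scales
target[:, 1] += 6.0 * labels

# alpha = 0 reduces to ordinary PCA on the target.
_, pca_directions = contrastive_pca(target, background, alpha=0.0, n_components=1)

# alpha > 0 penalizes directions that also vary in the background.
projected, cpca_directions = contrastive_pca(target, background, alpha=2.0, n_components=1)
```

In this toy setting ordinary PCA is dominated by the shared nuisance feature, whereas the contrastive direction aligns with the feature that differs between the two groups. The choice of `alpha` trades off target variance against background variance and would normally be explored over a range of values.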

Author information

Corresponding author

Correspondence to Tal Nawy.

Cite this article

Nawy, T. Contrasting PCA across datasets. Nat Methods 15, 572 (2018). https://doi.org/10.1038/s41592-018-0093-0
