Cleary, B. et al. Cell 171, 1424–1436.e18 (2017).

Big data comes at a price, and it is a glutton for computer memory. Rather than using data compression to solve this problem, Cleary et al. have come up with ways to generate roughly equivalent genomic data from a smaller set of gene expression measurements. The authors borrow an approach called compressed sensing from the field of signal processing. In their adaptation, random composite measurements, each a weighted combination of the expression levels of many genes, are used to infer the activity of gene modules. The expression of unmeasured genes can then be estimated from the inferred module activities. Their blind compressed sensing with sparse module activity factorization (BCS-SMAF) software can successfully recover the transcriptome from 100 composite measurements without requiring training data. The approach may be useful for screens, single-cell profiling and other large-scale expression studies.
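
The recovery idea can be illustrated with a toy sketch in Python. This is not the authors' BCS-SMAF software. It assumes, for simplicity, that the gene-module dictionary U is already known (BCS-SMAF is "blind" and does not need this), and it substitutes a standard sparse solver, orthogonal matching pursuit, for the authors' factorization. All names, sizes and values below (omp, U, A, 1,000 genes, 100 modules, 80 measurements) are hypothetical and the data are synthetic.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: greedy sparse solution of y = D @ w."""
    norms = np.linalg.norm(D, axis=0)
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Score each column by its normalized correlation with the residual.
        scores = np.abs(D.T @ residual) / norms
        if support:
            scores[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Refit all selected coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    w = np.zeros(D.shape[1])
    w[support] = coef
    return w


rng = np.random.default_rng(0)
n_genes, n_modules, n_measurements, n_active = 1000, 100, 80, 2

# Hypothetical module dictionary: each module is a disjoint block of genes.
U = np.zeros((n_genes, n_modules))
genes_per_module = n_genes // n_modules
for j in range(n_modules):
    block = slice(j * genes_per_module, (j + 1) * genes_per_module)
    U[block, j] = rng.uniform(0.5, 1.5, genes_per_module)

# Sparse module activity: only a few modules are active in this sample.
w_true = np.zeros(n_modules)
active = rng.choice(n_modules, size=n_active, replace=False)
w_true[active] = rng.uniform(1.0, 2.0, n_active)
x_true = U @ w_true  # full (synthetic) expression profile

# Random composite measurements: each is a weighted sum over all genes.
A = rng.normal(size=(n_measurements, n_genes))
y = A @ x_true

# Infer module activity from the composites, then estimate every gene.
w_hat = omp(A @ U, y, n_active)
x_hat = U @ w_hat
rel_error = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"relative reconstruction error: {rel_error:.2e}")
```

Here 80 composite measurements stand in for 1,000 individual gene measurements, and there are fewer measurements than modules, so the sparsity of module activity is what makes the problem solvable. Real data are noisy and the modules must themselves be learned, which is the harder problem the paper addresses.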