Fig. 1: Illustration of ‘panoramic’ dataset integration.

From: Efficient integration of heterogeneous single-cell transcriptomes using Scanorama

a, A panorama stitching algorithm finds and merges overlapping images to create a larger, combined image. b, A similar strategy can also be used to merge heterogeneous scRNA-seq datasets. Scanorama searches nearest neighbors to identify shared cell types among all pairs of datasets. Dimensionality reduction techniques and an approximate nearest-neighbors algorithm based on hyperplane locality-sensitive hashing and random projection trees greatly accelerate the search step. Mutually linked cells form matches that can be leveraged to correct for batch effects and merge experiments together (Methods), whereby the datasets forming connected components on the basis of these matches become a scRNA-seq ‘panorama’.
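
The idea of "mutually linked cells" in panel b can be illustrated with a minimal sketch. It is not Scanorama's implementation: it assumes cells are already embedded as low-dimensional coordinate tuples, and it uses an exact brute-force search with Euclidean distance in place of the approximate nearest-neighbors search the caption describes. The function names `k_nearest` and `mutual_nearest_neighbors` and the parameter `k` are hypothetical, chosen for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_nearest(queries, targets, k):
    """For each query point, return the indices of its k closest target points."""
    return [
        sorted(range(len(targets)), key=lambda j: dist(q, targets[j]))[:k]
        for q in queries
    ]


def mutual_nearest_neighbors(a, b, k=1):
    """Return the (i, j) pairs where a[i] and b[j] are among each other's k nearest neighbors."""
    a_to_b = k_nearest(a, b, k)  # neighbors in b for every cell in a
    b_to_a = k_nearest(b, a, k)  # neighbors in a for every cell in b
    forward = {(i, j) for i, nbrs in enumerate(a_to_b) for j in nbrs}
    backward = {(i, j) for j, nbrs in enumerate(b_to_a) for i in nbrs}
    # Only links found in both directions count as matches.
    return forward & backward
```

In this sketch, a cell in one dataset that has no counterpart in the other (for example, a cell type present in only one experiment) can still pick a nearest neighbor, but that neighbor is unlikely to pick it back, so no match is recorded for it.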

