Computational correction of index switching in multiplexed sequencing libraries

Figure 1: Computational correction of index switching.

Acknowledgements

This work was supported by the Swedish Research Council (grant 2017-01062 to R. Sandberg), the Bert L. and N. Kuggie Vallee Foundation (to R. Sandberg), a National Science Foundation Graduate Research Fellowship (grant DGE 1147470 to G.S.), the NIH (grant R01CA86065 to I.L.W.), the California Institute for Regenerative Medicine (grant GC1R-06673-A to M. Snyder; Collaborative Research Project to I.L.W.) and the Virginia and D.K. Ludwig Fund for Cancer Research (to I.L.W.).

Author information

Corresponding author

Correspondence to Rickard Sandberg.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Read count distributions at different thresholds of filtering.

(A) Histograms of gene read counts (HiSeq 4000, post-correction) for all 384 cells of the HSC plate after read-count filtering at the indicated thresholds (1 to 10 reads). (B) Read count distribution of the same plate (sequenced on the NextSeq) without correction or read-count filtering. The corrected data still contained an excess of low read counts, which we eliminated by applying a read-count threshold. Note that a read-count threshold alone is not sufficient to remove the cross-contamination signal (Supplementary Fig. 2).
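The thresholding described in this caption can be illustrated with a minimal sketch. It assumes that filtering at a threshold means setting every count below that threshold to zero, and that counts are held in a genes × cells integer array; both are assumptions made for illustration, not details given in the caption. The function names are hypothetical and are not part of the unspread software.

```python
import numpy as np

def apply_read_count_threshold(counts, threshold):
    """Zero out every entry below `threshold` (assumed filtering rule)."""
    counts = np.asarray(counts)
    return np.where(counts >= threshold, counts, 0)

def nonzero_count_histogram(counts, max_count=10):
    """Number of gene/cell entries with exactly 1, 2, ..., max_count reads."""
    values = np.asarray(counts, dtype=np.int64).ravel()
    values = values[(values > 0) & (values <= max_count)]
    # bincount index 0 would hold zero-count entries; drop it
    return np.bincount(values, minlength=max_count + 1)[1:]

# Toy genes x cells matrix (hypothetical data)
toy = np.array([[0, 1, 5],
                [2, 10, 3]])
filtered = apply_read_count_threshold(toy, threshold=3)
hist = nonzero_count_histogram(filtered, max_count=10)
```

Running the histogram on the unfiltered and filtered matrices side by side shows how an excess of low counts disappears as the threshold rises.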

Supplementary Figure 2 Evaluation of read count threshold for index switching correction.

(A) Histogram of false-positive gene expression, defined as expression detected on the HiSeq 4000 before (blue) and after (yellow) correction where no expression was detected in the same cell on the NextSeq 500. (B) Histogram of the fraction of false-positive expression signals removed per gene. (C) Histogram of false-negative gene expression, defined as no detectable expression on the HiSeq 4000 before (blue) and after (yellow) correction where expression was detected in the same cell on the NextSeq 500.
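The false-positive and false-negative definitions in this caption translate directly into boolean masks. The sketch below assumes that "detected" means a count greater than zero and that all matrices are genes × cells arrays covering the same genes and cells; the function names and toy values are hypothetical.

```python
import numpy as np

def false_positive_mask(hiseq, nextseq):
    """Detected on the HiSeq 4000 but not in the same cell on the NextSeq 500."""
    return (np.asarray(hiseq) > 0) & (np.asarray(nextseq) == 0)

def false_negative_mask(hiseq, nextseq):
    """Not detected on the HiSeq 4000 but detected in the same cell on the NextSeq 500."""
    return (np.asarray(hiseq) == 0) & (np.asarray(nextseq) > 0)

def fraction_false_positives_removed(hiseq_raw, hiseq_corrected, nextseq):
    """Per-gene fraction of false positives removed by correction.

    Genes with no false positives before correction get NaN.
    """
    before = false_positive_mask(hiseq_raw, nextseq).sum(axis=1)
    after = false_positive_mask(hiseq_corrected, nextseq).sum(axis=1)
    fraction = np.full(before.shape, np.nan)
    has_fp = before > 0
    fraction[has_fp] = (before[has_fp] - after[has_fp]) / before[has_fp]
    return fraction

# Toy genes x cells matrices (hypothetical data)
raw = np.array([[3, 0, 2], [1, 1, 0]])
corrected = np.array([[3, 0, 0], [1, 0, 0]])
nextseq = np.array([[5, 0, 0], [0, 0, 4]])

removed = fraction_false_positives_removed(raw, corrected, nextseq)
fn_per_gene = false_negative_mask(raw, nextseq).sum(axis=1)
```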

Supplementary Figure 3 Corrected data rescued index-driven HSC sub-clusters.

(A) Robust PCA (rPCA) analysis of HSCs based on HiSeq 4000 sequence counts, colored by Nextera i5 index. The rPCA algorithm assigns the most prominent outlier observations (i.e., cells) to the last principal components, which corresponded to wells that had been amplified using the same i7 and i5 indices (n = 384 cells). (B) Loadings of cells on PC24, stratified by i5 Nextera primers (n = 384 cells, 16 stratifications). (C) Loadings of cells on PC25, stratified by i7 Nextera primers (n = 384 cells, 24 stratifications). Center: median; hinges: 1st and 3rd quartiles; whiskers: 1.5 × interquartile range (IQR). (D) rPCA clustering of HSCs (as in A) for corrected HiSeq 4000 sequence counts, colored by Nextera i5 index primer. (E-F) As in (B-C) for corrected HiSeq 4000 sequence counts.
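The box-plot summary named in this caption (median, quartile hinges, whiskers at 1.5 × IQR) can be computed per index group as in the sketch below. It assumes the common Tukey convention in which each whisker ends at the most extreme data point within 1.5 × IQR of the hinge; the caption does not state which convention was used. The function names, the loadings, and the index labels are hypothetical.

```python
import numpy as np

def box_stats(values):
    """Median, hinges and whisker ends for one group of loadings."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points inside the fences (assumed convention)
    inside = values[(values >= low_fence) & (values <= high_fence)]
    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "whisker_low": float(inside.min()),
        "whisker_high": float(inside.max()),
    }

def stratify_loadings(loadings, index_labels):
    """Box statistics of per-cell loadings, grouped by index label."""
    loadings = np.asarray(loadings, dtype=float)
    index_labels = np.asarray(index_labels)
    return {
        label.item(): box_stats(loadings[index_labels == label])
        for label in np.unique(index_labels)
    }

# Toy example: eight cells, two hypothetical i5 index groups
loadings = [1, 2, 3, 4, 100, 10, 20, 30]
i5_labels = [0, 0, 0, 0, 0, 1, 1, 1]
by_index = stratify_loadings(loadings, i5_labels)
```

A group whose median sits far from the others on a late principal component is the kind of index-driven pattern the panels are meant to expose.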

Supplementary Figure 4 Corrected data rescued index-driven plate-associated clustering.

(A) PCA analysis of HSCs from two plates of libraries sequenced on the HiSeq 4000, colored by library plate. PC2 scores separate cells by plates (n = 384 cells per plate). (B) As in (A) for corrected transcriptome data.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4 and Supplementary Methods (PDF 938 kb)

Life Sciences Reporting Summary (PDF 67 kb)

Supplementary Software

unspread software for index switching correction (ZIP 6 kb)

Cite this article

Larsson, A., Stanley, G., Sinha, R. et al. Computational correction of index switching in multiplexed sequencing libraries. Nat Methods 15, 305–307 (2018). https://doi.org/10.1038/nmeth.4666
