
  • Matters Arising

APP gene copy number changes reflect exogenous contamination

Matters Arising to this article was published on 19 August 2020

The Original Article was published on 21 November 2018


Fig. 1: APP vector contamination in the Lee study.
Fig. 2: APP cDNA-supporting reads originate from exogenous PCR products and genome-wide human and mouse mRNA contamination.
Fig. 3: Absence of somatic APP retrogene insertions in our scWGS data.

Data availability

APP vector PCR sequences have been deposited in the NCBI SRA (PRJNA577966). Single-cell whole-genome sequencing data from control individuals have been deposited in the NCBI SRA (PRJNA245456) and dbGAP (phs001485.v1.p1). Single-cell whole-genome sequencing data from patients with AD are available upon request for genomic regions of APP and source pseudogene SKA3 and ZNF100.

Code availability

Implemented custom code for the estimation of clipped read fractions and the detection of intra-exon junctions (IEJs) is available at


References

  1. McConnell, M. J. et al. Intersection of diverse neuronal genomes and neuropsychiatric disease: The Brain Somatic Mosaicism Network. Science 356, eaal1641 (2017).

  2. Lee, M. H. et al. Somatic APP gene recombination in Alzheimer’s disease and normal neurons. Nature 563, 639–645 (2018).

  3. Lee, M.-H. et al. Reply: APP gene copy number changes reflect exogenous contamination. Nature (2020).

  4. Park, J. S. et al. Brain somatic mutations observed in Alzheimer’s disease associated with aging and dysregulation of tau phosphorylation. Nat. Commun. 10, 3090 (2019).

  5. Bushman, D. M. et al. Genomic mosaicism with increased amyloid precursor protein (APP) gene copy number in single neurons from sporadic Alzheimer’s disease brains. eLife 4, (2015).

  6. Kim, J. et al. Vecuum: identification and filtration of false somatic variants caused by recombinant vector contamination. Bioinformatics 32, 3072–3080 (2016).

  7. Rohrback, S. et al. Submegabase copy number variations arise during cerebral cortical neurogenesis as revealed by single-cell whole-genome sequencing. Proc. Natl Acad. Sci. USA 115, 10804–10809 (2018).

  8. Cooke, S. L. et al. Processed pseudogenes acquired somatically during cancer development. Nat. Commun. 5, 3644 (2014).

  9. Odelberg, S. J., Weiss, R. B., Hata, A. & White, R. Template-switching during DNA synthesis by Thermus aquaticus DNA polymerase I. Nucleic Acids Res. 23, 2049–2057 (1995).

  10. Evrony, G. D. et al. Cell lineage analysis in human brain using endogenous retroelements. Neuron 85, 49–59 (2015).

  11. Lodato, M. A. et al. Aging and neurodegeneration are associated with increased mutations in single human neurons. Science 359, 555–559 (2018).

  12. Erwin, J. A. et al. L1-associated genomic regions are deleted in somatic cells of the healthy human brain. Nat. Neurosci. 19, 1583–1591 (2016).

  13. Evrony, G. D., Lee, E., Park, P. J. & Walsh, C. A. Resolving rates of mutation in the brain using single-neuron genomics. eLife 5, e12966 (2016).

  14. Zhao, B. et al. Somatic LINE-1 retrotransposition in cortical neurons and non-brain tissues of Rett patients and healthy individuals. PLoS Genet. 15, e1008043 (2019).

  15. Zhang, X. et al. Cell-type-specific alternative splicing governs cell fate in the developing cerebral cortex. Cell 166, 1147–1162.e1115 (2016).


Acknowledgements

E.A.L. is supported by grants from the NIA (K01AG051791), the Suh Kyungbae Foundation, and the Charles H. Hood Foundation. This work was also supported by the Paul G. Allen Frontiers Group (C.A.W., E.A.L.), NINDS grant R01NS032457-20S1 (C.A.W.), DOD grant W18XWH2010028 (J.K., E.A.L., C.A.W.), Manton Center Pilot Project Award and Rare Disease Research Fellowship (B.Z.), NIH grants T32HL007627 and K08AG065502 (M.B.M.), and NIH grant AG054748 (M.A.L.). C.A.W. is an Investigator of the Howard Hughes Medical Institute.

Author information

Authors and Affiliations



Contributions

J.K. and E.A.L. conceived and designed the study. J.K. and B.Z. designed the APP vector PCR and sequencing, and B.Z. performed the PCR and sequencing. M.B.M. and M.A.L. performed single-neuron sorting and sequencing. J.K. and A.Y.H. performed bioinformatic analyses. E.A.L. and C.A.W. supervised the study. J.K., B.Z., M.B.M., M.A.L., C.A.W., and E.A.L. wrote the manuscript.

Corresponding authors

Correspondence to Christopher A. Walsh or Eunjung Alice Lee.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Pervasive recombinant vector contamination in next-generation sequencing.

a, Schematic of a retrogene insertion and the characteristics expected to be captured in sequencing data: increased exonic read-depth, discordant reads spanning exons, clipped reads at exon junctions, 3′ poly-A tail, target site duplication (TSD) at the new genomic insertion site, and clipped reads spanning the retrocopy and insertion sites. b, Recombinant vector contamination found in the Walsh laboratory data. Four single human neurons (1286_PFC_02, 1762_PFC_04, 5379_PFC_01, 5416_PFC_06) in our previous publication showed contamination by a mouse Nin recombinant vector (ref. 15). The homologous human gene region (NIN) is visualized by the IGV browser for a vector-contaminated cell (top) and an unaffected control cell (bottom). Contamination characteristics were identified, including increased exonic read-depth and exon-spanning discordant reads (reads coloured in red) with numerous mismatches to the human genome reference (coloured vertical bars in the read depth track). c, Mouse single-neuron WGS data from the Chun laboratory (ref. 7) contaminated by the same APP recombinant vector detected in the Lee study (ref. 2) and an additional APP plasmid vector (magnified panel).
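Panel a lists clipped reads at exon junctions as one signature of a retrocopy or vector insert. As a rough illustration only (this is not the authors' released code), the Python sketch below estimates the fraction of reads covering an exon edge that are soft-clipped exactly at that edge. The function name `clipped_fraction`, the `(pos, cigar)` input tuples and the example coordinates are hypothetical; positions are assumed to follow the 1-based SAM convention.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def clipped_fraction(reads, exon_start, exon_end):
    """Fraction of reads covering an exon edge that are soft-clipped exactly there.

    reads: iterable of (pos, cigar) with 1-based leftmost aligned position.
    exon_start, exon_end: 1-based inclusive exon coordinates (hypothetical input).
    """
    covering = clipped = 0
    for pos, cigar in reads:
        # Hard clips carry no sequence, so ignore them when looking for soft clips.
        ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
        if not ops:
            continue
        # Last aligned reference base: only M, D, N, = and X consume the reference.
        end = pos + sum(n for n, op in ops if op in "MDN=X") - 1
        if not (pos <= exon_start <= end or pos <= exon_end <= end):
            continue  # the alignment touches neither exon edge
        covering += 1
        lead_clip = ops[0][1] == "S" and pos == exon_start
        trail_clip = ops[-1][1] == "S" and end == exon_end
        clipped += lead_clip or trail_clip
    return clipped / covering if covering else 0.0

# Toy input, not data from the study: exon at 101-200.
toy_reads = [
    (101, "30S70M"),  # clipped at the exon start
    (80, "100M"),     # reads through the exon start into the intron
    (150, "51M49S"),  # clipped at the exon end
    (150, "100M"),    # reads through the exon end into the intron
    (120, "50M"),     # inside the exon, touches neither edge
]
print(clipped_fraction(toy_reads, 101, 200))  # 0.5
```

Reads that continue into intron sequence are counted in the denominator but not the numerator, so a high fraction at many junctions of one gene would be consistent with an intron-less copy such as a cDNA insert.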

Extended Data Fig. 2 Evidence that recombinant vector contamination is the major source of APP gencDNA.

a, Schematic of the DNA fragment size distribution for each APP source (source APP, APP retrocopy, APP vector). Fragments from APP vectors are expected to be more homogeneous and smaller than those from other sources owing to the fixed and relatively small vector size. b, DNA fragment (or insert) size estimation. Sequence reads mapped to APP exon junctions were divided into two groups: source APP (reads containing intron sequences) and APP gencDNA (reads clipped at the exon junction) supporting reads. gencDNA supporting reads were remapped to the APP reference transcript sequence (APP-751) to estimate insert sizes. c, Comparison of insert size distribution between source and gencDNA supporting reads. n, number of read pairs in each group; centre line, median; box limits, first and third quartiles; whiskers, 1.5 × interquartile range.
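The box-plot conventions stated for panel c (centre line at the median, box at the first and third quartiles, whiskers at 1.5 × the interquartile range) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: `box_stats` is a hypothetical helper, the quartile method (`inclusive`) is one of several common choices, and the insert sizes are made-up numbers.

```python
from statistics import quantiles

def box_stats(insert_sizes):
    """Summarise insert sizes the way a standard box plot does."""
    data = sorted(insert_sizes)
    # Quartiles by linear interpolation over the observed data (an assumed method).
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers reach the most extreme observations that lie inside the fences.
    return {
        "n": len(data),
        "median": med,
        "q1": q1,
        "q3": q3,
        "lower_whisker": min(x for x in data if x >= low_fence),
        "upper_whisker": max(x for x in data if x <= high_fence),
    }

# Made-up insert sizes (bp) for two read groups, for illustration only.
source_reads = [100, 200, 300, 400, 500, 2000]
gencdna_reads = [180, 190, 200, 210, 220]
print(box_stats(source_reads))
print(box_stats(gencdna_reads))
```

Comparing the two summaries side by side mirrors the comparison in panel c: a narrower box and shorter whiskers for one group indicate a more homogeneous fragment size distribution.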

Extended Data Fig. 3 New APP variants with intra-exon junctions as PCR artefacts.

a, Electrophoresis of PCR products from the vector APP inserts (APP-751, APP-695) showing novel APP variants as artefacts. All combinations of two PCR enzymes (FastStart PCR master mix and Platinum SuperFi DNA polymerase; OneStep Ahead RT–PCR in Fig. 1c) with three primer sets generated new bands smaller than the expected PCR product. b, PCR-induced IEJs with homologous sequences at each junction identified by Illumina sequencing. Twelve IEJs from our vector PCR sequencing showed exactly the same sequence homologies and genomic coordinates as IEJs reported by Lee et al. (ref. 2). For two IEJs, IGV browser images show pre- (left) and post-junction sites (right) connected by split reads spanning the IEJ (red arc). Because IGV displays forward strand sequences of the human reference genome, all IEJ sequences were also reverse complemented for consistent visualization.
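Two operations mentioned in panel b, reverse complementing a sequence and checking for sequence homology at a junction, can be illustrated with a short Python sketch. Both functions and the toy sequence are hypothetical and are not taken from the authors' code; the homology check shown here only counts bases shared immediately downstream of the two breakpoints, which is one simple way to define it.

```python
# Translation table for complementing DNA bases, preserving case and N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def microhomology_length(ref, left_end, right_start):
    """Count bases shared immediately after both breakpoints.

    The junction joins ref[:left_end] to ref[right_start:] (0-based indices).
    If ref[left_end:left_end + k] equals ref[right_start:right_start + k],
    the junction could be placed at k + 1 positions, so k is the homology length.
    """
    k = 0
    while (left_end + k < right_start
           and right_start + k < len(ref)
           and ref[left_end + k] == ref[right_start + k]):
        k += 1
    return k

# Toy sequence, not from APP: "CGT" follows both breakpoints.
toy_ref = "AAACGTTTTTTCGTGGG"
print(reverse_complement("AAACG"))            # CGTTT
print(microhomology_length(toy_ref, 3, 11))   # 3
```

A template-switching polymerase would be expected to leave such short shared sequences at the join, which is why homology at both sides of an IEJ is informative.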

Supplementary information

Supplementary Information

This file contains Supplementary Methods and Supplementary Figures 1 and 2.

Reporting Summary

Supplementary Table 1

This table shows identified intra-exon junctions (IEJs) generated by PCR artifacts.


Cite this article

Kim, J., Zhao, B., Huang, A.Y. et al. APP gene copy number changes reflect exogenous contamination. Nature 584, E20–E28 (2020).
