A potential undesirable side effect of somatic gene therapy is that, during gene delivery, foreign DNA reaches the gonadal tissue and inserts into the germ cell genome, with the risk of transmission to subsequent generations. If exogenous insertions do occur, what level of insertion is tolerable? This question has been the subject of public debate. The Food and Drug Administration in the United States has proposed that gene delivery strategies be limited to fewer than 50 haploid genome insertions per μg of DNA, or 1 event in roughly 6,000 sperm. In this context, it is important to consider the extent to which endogenous insertions occur in the human genome. Although the frequency of endogenous insertions has not been determined empirically, it can be estimated from available data.
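
The text does not spell out how a limit per μg of DNA relates to a limit per sperm. One plausible reading, sketched here, assumes that a haploid human genome weighs roughly 3.3 pg; that mass and the function name are assumptions for illustration and are not taken from the proposal itself.

```python
# Assumption (not stated in the text): one haploid human genome weighs ~3.3 pg.
HAPLOID_GENOME_PG = 3.3
PG_PER_UG = 1_000_000  # picograms in one microgram


def sperm_per_insertion(insertions_per_ug, haploid_genome_pg=HAPLOID_GENOME_PG):
    """Convert an insertion limit per ug of DNA into '1 event in N sperm'."""
    genomes_per_ug = PG_PER_UG / haploid_genome_pg  # haploid genomes in 1 ug of DNA
    return genomes_per_ug / insertions_per_ug


if __name__ == "__main__":
    n = sperm_per_insertion(50)
    print(f"50 insertions per ug is about 1 event in {n:,.0f} sperm")
```

Under this assumption, 1 μg of DNA holds about 300,000 haploid genomes, so 50 insertions per μg corresponds to roughly 1 event in 6,000 sperm, in line with the figure quoted above.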

The major causative agents of endogenous genomic insertions are LINE-1 (L1) retrotransposons. These elements, of which there are more than 100,000 forms, comprise approximately 15% of the human genome1. Most L1s are truncated or rearranged, and only about 3,000 are full-length. Of these full-length L1s, approximately 40–50 are active retrotransposons2. Retrotransposition occurs in a series of steps: transcription, endonucleolytic nicking of genomic DNA, reverse transcription of L1 RNA and integration of L1 DNA at the endonuclease cleavage site3. The L1 endonuclease has limited sequence specificity3; retrotransposition events therefore occur at many sites in the genome, including within genes. A HeLa cell culture assay was used to show that human L1 elements retrotranspose autonomously and to estimate the frequency at which L1s insert into genes4. As there is little, if any, bias against genes as sites of L1 retrotransposition in cultured cells5, L1-mediated insertions in the human genome are likely to occur at essentially random sites. Alu elements and processed pseudogenes insert at genomic sequences that closely resemble the sequences at L1 insertion sites6, suggesting that the L1 endonuclease is also responsible for the retrotransposition of these elements.

In which cells do endogenous retrotransposon insertions occur? Studies of L1 expression in mouse gametogenesis indicate that L1 elements are expressed in the leptotene and zygotene stages of primary spermatocyte development, that is, during the meiotic prophase7. While it is therefore likely that endogenous events occur in sperm, it is also possible that such events occur in the fertilized egg or during very early stages of embryonic development.

I now estimate the frequency of retrotransposition events. In humans, 28 retrotransposition events had been reported as of March 1999. Of these, 12 are L1 insertions8,9,10,11; 14 are Alu insertions8; and 2 are insertions of other SINE retrotransposons12,13. The number of independent human mutations reported in the Human Gene Mutation Database (http://www.uwcm.ac.uk/uwcm/mg/hgmd0.html) is 16,650 in 860 genes. Therefore, 28 of 16,650 mutations (about 1 in 600) are estimated to arise from retrotransposon-mediated insertion. Certain factors may make this estimate inaccurate. For example, retrotransposon-insertion mutations may often be overlooked by the PCR methods used for mutation analysis, which would make the estimate too low. On the other hand, recurrent mutations, which are common in some genes, are counted only once in the database, which would make it too high.
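
The fraction above can be checked with a short calculation. This is a minimal sketch that uses only the counts quoted in the text; the variable and function names are illustrative.

```python
from fractions import Fraction

# Counts quoted in the text (reported events as of March 1999).
RETRO_EVENTS = {"L1": 12, "Alu": 14, "other SINE": 2}
TOTAL_MUTATIONS = 16_650  # independent mutations listed in the database


def retro_fraction(events=RETRO_EVENTS, total=TOTAL_MUTATIONS):
    """Fraction of reported mutations that are retrotransposon insertions."""
    return Fraction(sum(events.values()), total)


if __name__ == "__main__":
    frac = retro_fraction()
    n_events = sum(RETRO_EVENTS.values())
    # 28 / 16,650 works out to about 1 in 595, which the text rounds to 1 in 600.
    print(f"{n_events} of {TOTAL_MUTATIONS}: about 1 in {round(1 / frac)}")
```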

The frequency of mutations in the human genome is estimated to be 10⁻⁹ per nucleotide per year14,15. With 3×10⁹ nucleotides per haploid genome and 25 years per generation, there are on average 75 mutations per haploid genome per generation, or 75 mutations derived from male germ cells in an individual human. If 1 in every 600 mutations is a retrotransposon insertion, then about 1 individual in every 8 (75×1/600) will carry a new retrotransposition event. Even if overestimated by a factor of 10, 1 endogenous genomic insertion is expected in every 50 to 100 individuals. Most insertions will be harmless, because exons and critical regulatory sequences make up less than 5% of human genomic DNA. Even an endogenous insertion rate of 1 in every 50 to 100 individuals is substantially greater than the rate of 1 event in 6,000 sperm that has been suggested as an upper limit for exogenous insertions in human gene therapy trials. Although the desired number of insertions into the genome from exogenous agents would ideally be zero, regulatory agencies should consider the endogenous frequency of insertion events when setting policy on this issue.
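
The arithmetic behind this estimate can be laid out step by step. The sketch below uses only the figures given in the text; the names are illustrative, and the final ratio compares a per-individual rate with a per-sperm limit in the same loose way the text does.

```python
# Figures quoted in the text.
MUTATION_RATE = 1e-9        # mutations per nucleotide per year
GENOME_SIZE = 3e9           # nucleotides per haploid genome
GENERATION_YEARS = 25       # years per generation
RETRO_FRACTION = 1 / 600    # share of mutations that are retrotransposon insertions
PROPOSED_LIMIT = 1 / 6000   # suggested upper limit: 1 exogenous event per 6,000 sperm


def mutations_per_generation(rate=MUTATION_RATE, size=GENOME_SIZE,
                             years=GENERATION_YEARS):
    """New mutations per haploid genome per generation."""
    return rate * size * years


def insertions_per_individual(mutations, retro_fraction=RETRO_FRACTION,
                              overestimate=1.0):
    """Expected new insertions per individual, optionally deflated by a safety factor."""
    return mutations * retro_fraction / overestimate


if __name__ == "__main__":
    m = mutations_per_generation()
    direct = insertions_per_individual(m)
    cautious = insertions_per_individual(m, overestimate=10)
    print(f"mutations per generation: {m:.0f}")
    print(f"direct estimate: 1 in {1 / direct:.0f} individuals")
    print(f"tenfold lower estimate: 1 in {1 / cautious:.0f} individuals")
    print(f"ratio to proposed limit: about {cautious / PROPOSED_LIMIT:.0f}-fold")
```

With these inputs the direct estimate is 1 in 8 and the tenfold lower estimate is 1 in 80, which falls within the range of 1 in 50 to 100 given above.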