Research Article

Immunology and Cell Biology (1998) 76, 395–405; doi:10.1046/j.1440-1711.1998.00772.x

PCR amplification of murine immunoglobulin germline V genes: Strategies for minimization of recombination artefacts

Paula Zylstra1, Harald S Rothenfluh2, Georg F Weiller3, Robert V Blanden2 and Edward J Steele1,2

  1. 1 Molecular Immunology Laboratory, Department of Biological Sciences, University of Wollongong, Wollongong, New South Wales
  2. 2 Division of Immunology and Cell Biology, John Curtin School of Medical Research
  3. 3 Bioinformatics Laboratory, Research School of Biological Sciences, Australian National University, Canberra, Australian Capital Territory, Australia

Correspondence: EJ Steele, Department of Biological Sciences, University of Wollongong, Northfields Avenue, Wollongong, NSW 2522, Australia. Email: <Ted_Steele@uow.edu.au>

Received 14 April 1998; Accepted 5 June 1998.


Abstract

Murine immunoglobulin germline V genes exist as multiple sequences arranged in tandem in germline DNA. Because members of V gene families are very similar, they can be amplified simultaneously using the polymerase chain reaction (PCR) with a single set of primers designed over regions of sequence similarity. In the present paper, the variables relevant to production of artefacts by recombination between different germline sequences during amplification are investigated. Pfu or Taq DNA polymerases were used to amplify from various DNA template mixtures with varying numbers of amplification cycles. Pfu generated a higher percentage of recombination artefacts than Taq. The number of artefacts and their complexity increased with the number of amplification cycles, becoming a high proportion of the total number of PCR products once the 'plateau phase' of the reaction was reached. Recombination events were located throughout the approximately 1-kb product, with no preferred sites of cross-over. By using the minimally detectable PCR bands (produced by the minimum number of amplification cycles), recombination artefacts can be virtually eliminated from PCR amplifications involving mixtures of very similar sequences. This information is relevant to all studies involving PCR amplification of members of highly homologous multigene families of cellular or viral origin.

Keywords:

artefact, immunoglobulin variable gene, PCR, recombination


Introduction

Our objective is to evaluate possible mechanisms responsible for the evolution and maintenance of Ig variable (V) region genes, which exist in large numbers in germline DNA. This involves the analysis of murine germline IgV sequences, in particular the J558 family of IgV heavy (H) chain sequences,1, 2, 3 and the mechanism of somatic hypermutation of rearranged V genes.4, 5 The advent of the polymerase chain reaction (PCR), allowing simultaneous amplification of a group of related sequences using a single set of PCR primers designed over regions of DNA sequence similarity,6 has allowed rapid amplification, cloning and sequencing of large numbers of V genes. V gene sequences are categorized into distinct families, with members having high sequence similarity (≥75%). Therefore, absolute accuracy of sequence determination is needed to distinguish between closely related genes. To this end, PCR amplifications were initially carried out using Pfu DNA polymerase, with an enzyme-generated error rate of ≤3.3 × 10⁻⁶ errors per nucleotide per cycle,7 approximately 12 times lower than the reported error rate for Taq DNA polymerase of between 2 × 10⁻⁴ and 4.1 × 10⁻⁵ errors per nucleotide per cycle (operationally this is approximately 1 substitution per 1000-b.p. fragment and approximately 1 substitution per 10 000 b.p. per amplified and sequenced product for Taq DNA polymerase and Pfu DNA polymerase, respectively).4, 6 Reports that hybrid molecules may be produced during PCR amplification from damaged DNA templates,8 from rearranged IgV genes in B cells9 or from PCR reactions with a high DNA template:primer ratio10 prompted investigation of whether, under the conditions employed in the Molecular Immunology Laboratory, Department of Biological Sciences, University of Wollongong, we could detect PCR-recombinants.2, 4, 7 These initial investigations detected no such recombinant artefacts.2, 7

Recent equipment changes have produced suspect sequences, suggestive of chimeric PCR products. Suspicious unique sequences ('singletons') were identified in which all sections of the sequence could be attributed to other genuine germline sequences (isolated from multiple, independent PCR amplifications3). Whether such 'singletons' are genuine genes or PCR artefacts is impossible to determine, because among genuine germline sequences similar types of mosaic patterns can be seen, indicative of recombination with other subfamily members over evolutionary time.3 Therefore, a sequence can only be considered genuinely present in genomic DNA if it is isolated more than once (from independent PCR amplifications). Minimization of PCR artefacts was thus essential in our attempt to calculate the numbers of V genes present in the J558 family and to further analyse the DNA sequence structure of germline genes to evaluate mechanisms responsible for their evolution and maintenance.3

Once suspect sequences had been identified from PCR-amplified genomic DNA, control reactions were included in every PCR amplification to detect possible PCR recombination. The controls showed that PCR recombination was indeed occurring. Therefore, comprehensive investigation of the variables affecting PCR recombination rates was undertaken. The results of this investigation, reported here, have led to a PCR amplification protocol in which recombination artefacts are undetectable, thus maximizing IgV gene sequence accuracy and efficiency of isolation of new genuine genes. This information is potentially relevant to all work involving PCR amplification of members of multigene families (e.g. in the immune and olfactory systems) or highly variable viral sequences.


Materials and Methods

Genomic DNA

High molecular weight genomic DNA was isolated from whole mouse embryos. After 17 days of gestation, 8–12-week-old C57BL/6J and BALB/c females were killed, the embryos removed and immediately rinsed three times in ice-cold phosphate-buffered saline. One to three embryos were placed in a sterile mortar containing a small amount of sterile sand and ground in detergent solution. The DNA was subsequently extracted with phenol:chloroform:isoamyl alcohol and chloroform. The RNA was removed by RNase treatment. The DNA was precipitated, spooled and resuspended in 1 mL H2O. The presence of high molecular weight DNA was confirmed by agarose gel electrophoresis.

Cloned DNA sequences

All template sequences were genuine germline genes PCR-amplified from BALB/c or C57BL/6J genomic DNA, using Pfu DNA polymerase. DNA for making the probes was PCR-amplified using Taq DNA polymerase (Perkin Elmer, Branchburg, NJ, USA) from a previously cloned, rearranged, unmutated VH186.2 (40.3) or rearranged VH205.12-related (A20/44) genomic sequences as used previously.5, 7

PCR amplification of V gene sequences

The PCR amplifications from cloned (Mixes 1–10, Table 1) and genomic DNA templates were carried out in 0.2-mL thin-walled reaction vessels in the Omn-E thermal cycler (Hybaid Ltd, Middlesex, UK) and Perkin Elmer 9600, or in 0.5-mL microfuge tubes (Sarstedt, Germany) in the Hybaid HBTR1 (Hybaid Ltd). Preliminary PCR amplifications from approximately equimolar mixtures of four cloned VH186.2-related templates (100 pg plasmid DNA, Mixes A and B, Table 1), run as control reactions concurrently with PCR amplifications of genomic DNA, utilized the Hybaid HBTR1, Perkin Elmer GeneAmp PCR System 9600 or Hybaid Omn-E thermal cyclers as indicated (Table 2). The 25-μL reaction mixtures used routinely in the Molecular Immunology Laboratory, University of Wollongong (not involving hot-starts) to obtain readily visible PCR products after 35–40 amplification cycles contained 2 pmol of each primer (Figure 1), 0.2 mmol/L dNTP, 2 mmol/L MgCl2 and manufacturer-supplied reaction buffer at 1× concentration. Pfu DNA polymerase (1.25 units; Stratagene, La Jolla, CA, USA) or 2.5 units of Taq DNA polymerase (Perkin Elmer) was added after the initial denaturation step. Genomic PCR amplifications were carried out from 100 ng of genomic DNA. Further PCR amplifications from mixtures of cloned DNA templates (n ≥ 4 different templates) and positive control PCR amplifications utilized 400 pg of DNA (Table 3). A negative control (no DNA) was also included. For PCR amplification of VH186.2-related sequences, the following primers were used (Figure 1a):

Figure 1.

Gene diagram for (a) VH186.2 and (b) VH205.12 polymerase chain reaction (PCR)-amplified sequences. The 5' PCR primers (5'CON for VH186.2 and 5'205 for VH205.12) are located approximately 500 b.p. 5' of the putative transcription start site (↱), with the common 3' PCR primer (VPR) located within the 3' end of the V coding region. L, leader; P, promoter. The Universal Reverse and sequence-specific internal sequencing primers S1 (VH186.2) and 205MID (VH205.12) were used to sequence one strand of the cloned PCR products. Sequence reactions routinely produced 550–750 b.p. of unambiguous sequence.





5'CON (5'-GCGGTCGACGTGATGCAATATTCTGTTGAC-3') and VPR (5'-GCGAATTCAGAGTCCTCAGATGTCAGGCT-3'). The VPR primer was also used for the amplification of VH205.12-related sequences, but a different upstream flanking sequence-specific primer was used (Figure 1b):

5'205 (5'-GCGGTCGACTCAGTATGTGACAGTGACTAG-3'). The upstream primers (5'CON, 5'205) contained a SalI restriction site, whereas the downstream primer (VPR) contained an EcoRI site to facilitate directional cloning into the pUC19 cloning vector. After a 5-min denaturation step at 95 °C, amplification was allowed to proceed for a specified number of cycles (varying for different experiments) at 94 °C (30 s), 55 °C (30 s) and 72 °C (Taq DNA polymerase) or 75 °C (Pfu DNA polymerase) (3 min). All PCR reagents were aliquoted using filter tips in a laminar flow hood located in a part of the building never used for other pre- or post-PCR DNA manipulations.
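
The restriction sites built into the primers can be checked directly from the sequences given above. The short Python sketch below searches each primer for the SalI (GTCGAC) and EcoRI (GAATTC) recognition sequences; the function and variable names are illustrative only and are not part of the original protocol.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "5'CON": "GCGGTCGACGTGATGCAATATTCTGTTGAC",
    "5'205": "GCGGTCGACTCAGTATGTGACAGTGACTAG",
    "VPR": "GCGAATTCAGAGTCCTCAGATGTCAGGCT",
}

# Recognition sequences of the two enzymes used for directional cloning.
SITES = {"SalI": "GTCGAC", "EcoRI": "GAATTC"}


def find_sites(primer_seq):
    """Return {enzyme: 0-based start index} for each site found in the primer."""
    found = {}
    for enzyme, site in SITES.items():
        index = primer_seq.find(site)
        if index != -1:
            found[enzyme] = index
    return found


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

In each primer the site sits immediately after a short 5' clamp, so a product amplified with one upstream primer and VPR carries SalI at one end and EcoRI at the other.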

Cloning and sequencing of PCR products

The amplification products were electrophoresed on 0.7% agarose gel (60 V, 90 min, in BioRad Wide mini-sub cell; BioRad Laboratories, Hercules, CA, USA) with one empty lane between the samples to prevent cross-contamination while excising the bands. The electrophoresed DNA was removed from the agarose gel by placing the gel slice into a filter unit inside a microfuge tube (0.2 μm UltrafreeMC, Millipore Corp., Bedford, MA, USA) followed by centrifugation at 15 800 g for 5 min in a bench centrifuge (Eppendorf, Model no. 5415C; Eppendorf-Netheler-Hinz GmbH, Hamburg, Germany). The buffer collected at the bottom of the microfuge tube contained most (>80%) of the DNA, while the agarose gel with traces of residual DNA remained inside the filter unit. The DNA was then purified by isopropanol/0.8 mol/L LiCl precipitation at –20 °C for at least 60 min, pelleted as described at 15 800 g but for 30 min, washed with 70% ethanol at 15 800 g for 5 min, and the pellet dissolved in 17 μL reagent-grade H2O and kept at 4 °C until use (no more than 24 h).

Each eluted PCR product was restricted with EcoRI and with SalI, and then ligated into 50 ng of pUC19 DNA that was also restricted with EcoRI and SalI. Ligations were carried out in 20 μL reaction volumes and placed in an ice/water bath (initial temperature 10 °C) overnight. Following ligation, the sample volume was brought to 100 μL with water, and 2 μL of this was used for each transformation (1 ng). The remainder was stored at 4 °C. The JM109 strain of Escherichia coli was made competent with CaCl2 following standard methods.11 Approximately 2 × 10⁸ cells were used for each transformation. This resulted in a plasmid to E. coli cell ratio of ≤1. The ligated DNA and the competent cells were held on ice for 60 min, heat shocked at 37 °C for 5 min, then cooled on ice. After a 1 h incubation in 1 mL Luria broth (LB) (37 °C), the transformation mixture was plated over two minimal agar plates containing ampicillin (100 μg/mL), X-Gal and isopropyl thiogalactose (IPTG; 0.6 μg/mL each, all reagents from Progen Pty Ltd, Darra, Qld, Australia). Recombinant white colonies were screened by colony hybridization using probes generated with the Ready-to-go labelling system (AMRAD-Pharmacia Pty Ltd, Melbourne, Vic., Australia), and the reaction allowed to continue in a 37 °C water bath overnight.

Positive colonies were grown overnight in 3 mL Luria broth containing ampicillin (100 μg/mL). The DNA was extracted using the Qiawell+ plasmid mini (for sequencing) or midi (for PCR) prep kit (QIAGEN, Hilden, Germany), or the Wizard plasmid spin prep kit (Promega, Madison, WI, USA) as instructed by the manufacturers. The DNA was eluted in 150 μL 100 mmol/L Tris-HCl pH 8.5 or 100 μL sterile reagent-grade H2O, respectively, heated at 70 °C for 10 min before storage at –20 °C. The insert was excised from 5 μL of DNA with EcoRI and SalI and electrophoresed on a 1.5% agarose gel to confirm correct insert size (60 V, 90 min, BioRad wide mini sub cell; BioRad Laboratories). Clones with the correct size insert (which includes primers) of 1018 b.p. (VH186.2) or 1008 b.p. (VH205.12) were sequenced.

Fluorescent automated DNA sequencing

Sequencing reactions were performed in 0.2-mL thin-walled reaction vessels in the Perkin Elmer GeneAmp PCR System 9600 thermal cycler. The positions of the sequencing primers for both strands of the VH186.2- and VH205.12-related clones are shown in Figure 1. Sequencing to identify genuine germline genes usually covers both strands of the cloned DNA insert to ensure accuracy.3 However, because the initial template sequences were known in this case, sequencing of PCR clones from template mixtures of genuine germline genes covered a single strand, using two to three primers to obtain the necessary sequence. Half-volume reactions were used, with 4 μL of ABI Prism sequencing buffer, 3.2 pmol primer and 2.8 μL DNA (250–400 ng). The reactions were cycled and purified following the manufacturer's instructions. Pellets were washed with 300 μL 70% ethanol and spun again at 15 800 g for 5 min, before being dried (Savant Speed-Vac Concentrator; Savant Instruments Inc., Farmingdale, NY, USA). Pellets were resuspended in 5.7 μL of a 5:1 v/v solution of EDTA-dextran and formamide, and electrophoresed on a 5.2% Page+ gel (Amresco, Solon, OH, USA) on an ABI 377 automated DNA sequencer (Perkin Elmer). These conditions allowed unambiguous base determination for 550–750 b.p. of DNA sequence. Sequence fragments were assembled with AssemblyLign 1.0 (Eastman Kodak Co., Rochester, NY, USA) and aligned with the MegAlign package (DNASTAR Inc., Abacus House, West Ealing, London, UK).

Visual and quantitative determination of band intensity

The PCR amplifications were carried out using Pfu DNA polymerase or Taq DNA polymerase, from BALB/c genomic DNA or from Mix3 (Table 1). Reactions were removed from the thermal cycler in 1- and 2-cycle increments, over the exponential phase of the amplification, and then every five cycles. The PCR amplifications were duplicated on the Hybaid HBTR1 and Omn-E thermal cyclers. For visual estimation, the duplicate PCR products from both machines were electrophoresed on an agarose gel (see previous section) and band intensity determined visually. For accurate quantification, repeats of these reactions were carried out. Each set of reactions was run on a gel separately (10% of reaction volume) alongside standard amounts of (EcoRI and SalI restricted CHS1.3) cloned DNA (VH186.2). The DNA was transferred to a Hybond+ Nylon membrane12 and hybridized (1.25 mmol/L EDTA, 525 mmol/L Na2HPO4, 7% SDS) with the probe (Ready-to-go; Pharmacia) overnight at 55 °C. The membrane was washed in 20–40 mL of Wash Buffer A (1 mol/L EDTA, 50 mmol/L Na2HPO4, 5% SDS) for 10 min at 25 °C, and 20–40 mL of Wash Buffer B (1 mmol/L EDTA, 100 mmol/L Na2HPO4, 1% SDS) for 10 min at 55 °C. Membranes were exposed to phosphor imaging screens and images scanned using the Storm 840 Trimode Analysis System and ImageQuant 1.1 software (both from Molecular Dynamics Pty Ltd, Box Hill, Vic., Australia). Scanned images were analysed (Molecular Analyst 2.1, Bio-Rad Laboratories) to determine band intensity.


Results

Varying the thermal cycler in preliminary experiments

Once suspect sequences had been identified from PCR-amplified genomic DNA, two control reactions were included in every PCR amplification to detect possible recombination. Each control tube contained four different cloned VH186.2-related germline sequences (Table 1, Mixes A and B), each at 25 pg (total of 100 pg in 25 μL), and a number of hybridization-positive clones from each reaction were sequenced. In these preliminary experiments, the rate of detectable PCR recombination appeared to vary with the type of thermal cycler used (Table 2). The Hybaid Omn-E showed no PCR recombination in these control reactions. Therefore, this machine was used for all subsequent PCR amplifications. However, further suspect sequences were isolated from genomic amplifications using the Hybaid Omn-E (data not shown). These data indicated the need for comprehensive investigation of variables relevant to production of PCR recombination artefacts using multitemplate mixtures.

Varying the number and sequence diversity of the template mixture

Two independent PCR amplifications were carried out with the Hybaid Omn-E on each of mixtures 1–10 of cloned DNA templates (Table 1), with up to 30 hybridization-positive clones being sequenced from each PCR run. PCR recombination was detected in every amplification (Table 3), with the apparent rate being higher with larger numbers of templates per mixture and with equimolar mixtures (in comparison with non-equimolar mixtures). In the case of equimolar mixtures, the only difference between mixtures was the number of templates and the degree of similarity between different individual templates. Theoretically, the number of actual recombination events should be the same for all mixtures. However, recombination between identical templates will be undetectable. This will happen more frequently in mixtures with small numbers of templates than with large numbers, and more frequently in non-equimolar than equimolar mixtures. Therefore, the number of detectable recombination events is a reflection of how early in the amplification run recombination between non-identical templates took place. In effect, each such event adds a new template to the mixture, which becomes eligible for amplification and participation in further detectable recombination. The data in Table 3 clearly indicate the unpredictability of these chance events in different amplification runs involving the same or different mixtures.
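
The argument above can be made concrete with a small calculation. A minimal sketch, assuming (as a simplification not tested in this study) that the two strands taking part in a recombination event are drawn independently and in proportion to template abundance: the fraction of events that join non-identical templates, and are therefore detectable in principle, is 1 − Σ p_i², where p_i is the share of template i. The function name and the example proportions are hypothetical.

```python
def detectable_fraction(proportions):
    """Fraction of random pairings that involve two different templates.

    Assumes both partners are drawn independently, in proportion to the
    relative abundance of each template (an illustrative simplification).
    """
    total = sum(proportions)
    shares = [p / total for p in proportions]
    return 1.0 - sum(s * s for s in shares)


if __name__ == "__main__":
    print(detectable_fraction([1, 1, 1, 1]))   # four templates, equimolar
    print(detectable_fraction([1] * 10))       # ten templates, equimolar
    print(detectable_fraction([7, 1, 1, 1]))   # four templates, one dominant
```

Under this assumption the detectable fraction rises with the number of templates and falls when one template dominates, which matches the direction of the trend described for Table 3.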

The percentage of recombinant sequences seen in Table 3 is much higher than expected from the low percentage of suspicious singleton gene isolations from PCR using genomic DNA that were previously carried out in the Molecular Immunology Laboratory at the University of Wollongong (for example, using the Hybaid Omn-E and Pfu to amplify VH186.2-related genes from C57BL/6J genomic DNA (100 ng), no PCR-recombinants were detected after 40 cycles among 28 clones, HS Rothenfluh unpubl. data). This difference might be due to the expected lower number of starting templates in genomic DNA (Figure 3), or due to the necessary use of a new batch of Pfu DNA polymerase (batch 0753025 for previous genomic DNA PCR amplifications; batch 0664033 for Table 3). However, the rates of PCR recombination generated by both Pfu batches off the same set of diverse templates (Mix3) using the Hybaid Omn-E are very similar (Table 3, Table 4), indicating that there was no demonstrable difference between the two Pfu batches with respect to the generated incidence of PCR-recombinants.

Figure 3.

DNA concentration versus cycle number. The efficiency of the polymerase chain reaction (PCR) amplification, from cloned (Mix3) or genomic (BALB/c) DNA as template, was quantified for reactions using (a) Pfu or (b) Taq DNA polymerases. Each experiment was performed once only because the aim of these experiments was not to determine the DNA concentration of reactions as such, but to determine the relative efficiency of each thermal cycler and each DNA polymerase, and to determine the relative degradation of PCR product in each case. These results agreed in all experiments where hybridizations were carried out (eight experiments, Figure 3), and where results were judged by eye from agarose gel electrophoreses (12 experiments, data not shown). (square), Mix3 Omn-E; (diamond), Mix3 HBTR1; (circle), BALB/c Omn-E; (triangle), BALB/c HBTR1.



Varying the DNA polymerase

A further set of experiments was set up to investigate differences between Pfu and Taq. The PCR amplifications using Mix3 (Table 1) were carried out comparing Pfu with Taq, using 20 and 40 cycles, respectively (Table 4). In these experiments, amplifications using Taq showed significantly lower rates of PCR recombination, as well as lower numbers of complex recombinants than with Pfu (Table 4). In the present study only one batch of Taq was used, but given the ease with which this batch generated PCR-products of readily visible band intensity (on agarose gels), we have no reason to suspect that it was performing abnormally.

Varying the number of amplification cycles

The PCR amplifications from Mix3 were removed from the thermal cycler at different times (intervals of five amplification cycles from 15 to 40 cycles). Bands from 20 amplification cycles were cloned and sequenced, and showed a marked decrease in PCR recombination for both Pfu and Taq DNA polymerases in comparison with 40 cycles (Table 4). The brightness of the PCR bands (as judged by eye) from the agarose gel electrophoresis was similar from 40 cycles to 20 cycles, thereafter decreasing in intensity (data not shown). In order to minimize PCR recombination, the minimum number of cycles required to produce a band suitable for cloning was determined. PCR amplifications of Mix3 were carried out, with reactions increasing by one cycle from 10 to 20 cycles. The minimum clonable band was obtained after 14 cycles of amplification using Taq, and 10 cycles using Pfu (Table 4). No recombinant molecules were detected at this level using Pfu, in 29 clones sequenced. One recombinant molecule was detected from 30 clones sequenced using Taq at 14 cycles of amplification; this was a single recombination event within 7 b.p. of the 5' end of the amplified molecule. These results therefore show a positive correlation between increasing cycle number (DNA product concentration) and the number and complexity of recombinant molecules. Reducing cycle number to the minimum required for cloning virtually eliminated recombination artefacts.
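
Because "no recombinants detected" depends on how many clones were sequenced, it can help to put an upper bound on the underlying artefact frequency. The sketch below is an illustrative addition rather than an analysis performed in this study. It computes the exact one-sided binomial upper confidence limit for the zero-event case (0 recombinants among 29 clones, as reported above for Pfu at 10 cycles). It assumes clones are independent samples; the function name is hypothetical.

```python
def upper_bound_zero_events(n_clones, confidence=0.95):
    """One-sided upper confidence limit on a proportion when 0 of n events are seen.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_clones)


if __name__ == "__main__":
    bound = upper_bound_zero_events(29)
    print(f"0/29 recombinants: true frequency below {bound:.1%} at 95% confidence")
```

With about 30 clones per reaction, a zero count is therefore compatible with a true artefact frequency of up to roughly one in ten, which is one reason isolating a sequence from independent amplifications remains a useful safeguard.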

The efficiency of the PCR amplification

The main point of the present paper has been to use artificial mixtures of templates to establish some of the main conditions under which PCR recombination may or may not occur. When using a mixture of cloned DNA sequences as a control for possible PCR recombination, it would obviously be preferable to have conditions as close as possible to the conditions for genomic DNA amplifications. However, 400 pg of cloned DNA was used as template not because it approximated the number of DNA targets present in the genomic (100 ng template) DNA amplifications, but because visible PCR bands could not be detected with a lower concentration of cloned DNA targets at the time these experiments were initiated. Thus, to determine the efficiency of amplification from genomic DNA versus cloned DNA, and to what extent the various thermal cyclers we have used affected the efficiency of our PCR reactions and therefore the incidence of PCR recombination, further PCR amplifications were carried out on cloned DNA (either a single template CHS1.3, or Mix3) and genomic DNA (C57BL/6J and BALB/c). These were duplicated using the new, more efficient, Hybaid Omn-E in comparison with the older Hybaid HBTR1 thermal cycler. The Hybaid Omn-E uses a solid heating block, whereas the HBTR1 uses a plate moulded from thin metal sheet. This, and the use of 0.2-mL thin-walled tubes plus the hot lid in the Omn-E, produce a closer fit between tube and block than in the case of the HBTR1. The Omn-E gives a read-out on completion of the reaction which identifies the maximum and minimum temperatures reached during the amplification. Amplification used either Pfu or Taq DNA polymerases. Band intensity was judged by visual inspection of agarose gel electrophoreses of 12 different PCR amplifications. Figure 2 shows an example using Taq to amplify CHS1.3 DNA for between 15 and 40 cycles. The last visible band occurred at 19 cycles for the HBTR1 and 17 cycles for the Omn-E. Therefore, for these experiments the latter is more efficient.
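
To illustrate why 400 pg of cloned DNA and 100 ng of genomic DNA are not equivalent in target number, the following back-of-envelope sketch converts each mass to an approximate copy number. The values it relies on are assumptions that are not taken from this paper: an average mass of 650 g/mol per base pair, a pUC19 vector of 2686 b.p. carrying an insert of roughly 1000 b.p., and a haploid mouse genome of about 2.7 × 10⁹ b.p.

```python
AVOGADRO = 6.022e23   # molecules per mole
BP_MASS = 650.0       # assumed average g/mol per base pair of dsDNA


def copies_from_mass(mass_g, length_bp):
    """Approximate number of double-stranded DNA molecules in a given mass."""
    return mass_g * AVOGADRO / (length_bp * BP_MASS)


# Assumed sizes (not stated in the text): pUC19 2686 b.p. + ~1000 b.p. insert,
# and a haploid mouse genome of ~2.7e9 b.p.
PLASMID_BP = 2686 + 1000
MOUSE_HAPLOID_BP = 2.7e9

plasmid_copies = copies_from_mass(400e-12, PLASMID_BP)      # 400 pg cloned DNA
genome_copies = copies_from_mass(100e-9, MOUSE_HAPLOID_BP)  # 100 ng genomic DNA

if __name__ == "__main__":
    print(f"cloned templates:  ~{plasmid_copies:.1e} plasmid copies")
    print(f"genomic templates: ~{genome_copies:.1e} haploid genome copies")
```

Under these assumptions the cloned mixture supplies on the order of 10⁸ plasmid molecules against some 10⁴ haploid genomes. Each genome carries many related V genes, so the difference per amplifiable target is smaller than the raw ratio suggests, but it remains large.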

Figure 2.

Gel electrophoresis of polymerase chain reaction (PCR) products from amplifications run simultaneously on the Hybaid HBTR1 and Hybaid Omn-E thermal cyclers. Taq DNA polymerase was used to amplify from the cloned CHS1.3 (VH186.2) sequence (400 pg). The PCR reactions were removed at 15, 17, 19, 20, 25, 30, 35 and 40 amplification cycles.


The PCR amplifications from Mix3 and BALB/c DNA were further quantified by Southern blot and comparison with standard quantities of cloned DNA. With Taq, the Omn-E was more efficient than the HBTR1, with maximal DNA concentration (plateau phase) being achieved at an earlier cycle number (Figure 3). This efficiency gain was not evident with Pfu. Once the maximal DNA concentration was reached in reactions using the Omn-E with either Pfu or Taq, the DNA concentration of the correct-sized band decreased, with a corresponding increase in the background smear of DNA. Southern blots of these agarose gels showed that the smear included specific DNA sequence which hybridized with the VH186.2 probe. In contrast, the DNA concentrations in reactions run on the HBTR1 rarely passed the plateau phase after 40 cycles. By estimating the band intensity from agarose gels it appeared that the specific PCR product decreased in later cycles only in reactions using Pfu. However, Figure 3 shows that in reactions using Taq a decrease in specific product occurred towards the end of the 40-cycle reaction, although the decrease is less marked than with Pfu. It has been shown that excessive cycling can produce higher molecular weight fragments, once the primer in the reaction is depleted, with the 3' OH ends of the specific PCR product annealing to either the genomic template or itself and being extended.13 This could explain the smear of higher molecular weight fragments present in PCR amplifications using both Taq and Pfu DNA polymerase. However, the smears produced in PCR amplifications using Pfu DNA polymerase also included fragments shorter than the specific PCR product, which might be produced by degradation of specific product by Pfu DNA polymerase. Comparison of the DNA concentrations of PCR reactions amplified with Pfu and Taq DNA polymerases shows that the use of Taq DNA polymerase results in more PCR product than Pfu DNA polymerase (Figure 3, compare (a) to (b)). This may be because Pfu DNA polymerase, with its proofreading activity, has a five-fold lower rate of nucleotide incorporation than Taq DNA polymerase.14 The discrepancy between the final DNA concentration for each enzyme and the minimum number of cycles necessary for a clonable band may be explained by slight variations in the efficiency of the cloning of individual PCR products.

The other conclusion from this experiment is that there is both a significantly longer lag time for the appearance of PCR products (Pfu or Taq) and a lower plateau when the amplification proceeds from genomic DNA templates, with little difference between the two thermocyclers. This result suggests that the rates of PCR recombination recorded using artificial mixtures of cloned templates are likely to be higher than what might be generated during PCR amplification of related germline IgV genes from genomic DNA (a conclusion which is consistent with observations, see above).

Complexity of recombinant molecules

It was found that the complexity of the recombinant molecules increased with cycle number (Table 4). Many artefacts from 40 cycles of amplification contained two or more detectable recombination events, with some extensive mosaics of five or more donor sequences (Figure 4), whereas artefacts at 20 cycles contained only one or two events. Clones produced using Pfu contained more multiple recombinants than those produced by Taq. For experiments at 40 cycles the location where recombination events occurred was recorded (termed the 'recombination window', defined in Table 3), with the midpoint of this 'window' being plotted (Figure 5).

Figure 4.

One of the recombinant sequences (SJP1.17) produced by polymerase chain reaction (PCR) amplification from an equimolar mix of VH186.2-related C57BL/6J sequences (Mix1), using Pfu DNA polymerase and 40 amplification cycles. Six different donor sequences (CHS1.22, CHS1.9, CHS1.15, CHS1.3, CHS1.17 and CHS1.11) were incorporated into the PCR-amplified fragment.


Figure 5.

Locations of polymerase chain reaction (PCR) recombination events along the lengths of the PCR-amplified fragments. The stretch of sequence in which each recombination event occurred was determined and the midpoint of this value plotted in 20-b.p. intervals. The PCR used (a) Pfu DNA polymerase (batch 0753025) or (b) Taq DNA polymerase to amplify from a mixture of (cloned) VH186.2-related sequences from BALB/c germline DNA (Mix3), at 40 cycles of amplification. A total of 57 sequences (a) or 60 sequences (b) were combined from two independent PCR runs.


Distribution of recombination sites

Recombination events are distributed essentially randomly, for both Pfu and Taq DNA polymerases (Figure 5). For PCR products generated by Pfu DNA polymerase there is a slight peak at the 5' end of the sequence between nucleotide positions 20 and 40. Indeed there are some other apparent peaks scattered throughout the amplified region that may also be other sites of preferred recombination, although to validate this would require further statistical sampling on a larger scale. The general impression from this data set is that sites of recombination occur at random.
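
One simple way to carry out the further statistical testing mentioned above would be a chi-square goodness-of-fit comparison of the binned midpoints against a uniform expectation. The sketch below shows only the calculation of the statistic; the bin counts in the usage lines are invented for illustration and are not data from Figure 5.

```python
def chi_square_uniform(counts):
    """Chi-square statistic for observed bin counts against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


if __name__ == "__main__":
    # Hypothetical counts of recombination midpoints per bin (NOT from Figure 5).
    example_counts = [4, 2, 3, 1, 2, 3]
    statistic = chi_square_uniform(example_counts)
    degrees_of_freedom = len(example_counts) - 1
    print(f"chi-square = {statistic:.2f} with {degrees_of_freedom} degrees of freedom")
```

The statistic would then be compared with the chi-square distribution for the stated degrees of freedom. With 20-b.p. bins over an approximately 1-kb product, many bins would have small expected counts, so wider bins or an exact test may be more appropriate.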

Uniqueness of recombinant molecules

Because sites of recombination were randomly distributed, the probability of producing two identical recombinant molecules from independent PCR amplifications is extremely low. In our experiments, one such case was observed from a total of 98 sequences. Although this figure increases to 1 in 57 sequences when complex artefacts are removed from the calculation (less complex artefacts are more likely to be produced at low cycle numbers), the lower PCR recombination rate seen at lower cycles means it is still unlikely that identical recombination artefacts could be independently produced. The two recombinant sequences that were found to be identical were produced by a single recombination event, with the sequence stretch over which the event could have occurred spanning approximately 30 b.p., 125 b.p. from the 5' end of the 958-b.p. fragment. With this result it can be reasonably assumed that a PCR-amplified sequence isolated multiple times (from independent reactions) is genuine. Employing this safeguard with all experimental data sets also ensures that sequences with PCR-induced base substitution errors are not included in analyses.3

Base substitution errors of Taq and Pfu DNA polymerases

The rate of substitution errors for Taq (analysing amplifications from Mix3 at 20 cycles) was in close agreement with other reports,4, 6 with an operational rate of one error per 959 ± 32 b.p. (from two PCR). In this set of experiments, the rate of substitution errors for Pfu varied, ranging from an unusual one error per 984 b.p. to a much lower rate of one error per 12 454 b.p., the lower rate being the norm. The majority of base substitutions produced using Taq were T → C (18 out of a total of 59), followed by A → G and T → A (Table 5). The majority of base substitutions produced using Pfu DNA polymerase were C → A (10 out of 16), all from a single PCR.
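An operational error rate of this kind (one error per N b.p. sequenced) and the accompanying tally of substitution types can be computed as in the sketch below. It assumes that each clone is non-recombinant, has already been assigned to its parental template, and is aligned to it without gaps; the function name and the example sequences are hypothetical.

```python
from collections import Counter


def substitution_spectrum(reference, clones):
    """Tally base substitutions in `clones` relative to `reference`.

    Returns (counts, bp_per_error), where `counts` maps strings such as
    "T>C" to the number of occurrences and `bp_per_error` is the number of
    bases sequenced per substitution (None if no substitution was found).
    """
    counts = Counter()
    total_bases = 0
    for clone in clones:
        if len(clone) != len(reference):
            raise ValueError("clone and reference must be the same length")
        total_bases += len(clone)
        for ref_base, obs_base in zip(reference, clone):
            if ref_base != obs_base:
                counts[f"{ref_base}>{obs_base}"] += 1
    n_errors = sum(counts.values())
    bp_per_error = total_bases / n_errors if n_errors else None
    return counts, bp_per_error


# Made-up sequences for illustration only.
example_reference = "ACGTACGT"
example_clones = ["ACGTACGT", "ACGCACGT", "GCGTACGT"]
spectrum, rate = substitution_spectrum(example_reference, example_clones)
```

The value returned is a rate per base sequenced after the whole amplification, not a per-cycle misincorporation rate.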


Discussion

Any work involving the PCR amplification from genomic DNA using primers which can hybridize with a number of different though very similar genes runs the very high risk of producing recombination artefacts. In the present study our objective was to define in detail the variables affecting the rate of production of recombination artefacts in the case of IgV genes, in order to develop a protocol that minimizes artefacts and thus maximizes accuracy and efficiency of the identification of new IgV genes. The rationale developed is relevant to similar work with other multigene families or viruses which mutate rapidly during infection (e.g. HIV).

We have shown that recombination events routinely occur during the PCR amplification of mixtures of templates with very similar sequences, and that the number and complexity of the recombinant molecules produced increase with cycle number. Pfu DNA polymerase produced more artefacts and more complex artefacts than Taq, particularly at high cycle numbers. This occurred despite the fact that Taq generated higher DNA concentrations than Pfu, presumably because of more rapid strand extension with Taq (which lacks the proofreading activity of Pfu). We tested three different thermal cyclers and concluded that the newer models, which incorporate features better able to control temperatures in the reaction tubes, are more efficient, with comparable PCR products obtained after fewer cycles of amplification. A drawback is that this higher efficiency could also lead to increased PCR recombination (at comparable cycle numbers).

The data presented here indicate that by far the most important variable in minimizing PCR recombination artefacts is the number of amplification cycles. Using the minimum number of cycles required to obtain a clonable product virtually eliminates recombination. The accuracy of a particular gene sequence can then be confirmed by re-isolating the identical sequence from an independent amplification. While it is clear that Taq is superior to Pfu with respect to generation of recombination artefacts at high DNA concentration in late cycles, it has the disadvantage of introducing more base substitution errors. At minimum cycle number, Pfu may be a more appropriate choice. However, batch variation in Pfu with respect to both recombination artefacts and base substitution errors is a potential problem. While there are differences between various thermal cyclers with respect to efficiency and accuracy of temperature control in reaction tubes, use of minimum cycle number is of much greater importance than choice of equipment.

Other reports involving PCR recombination between two different template sequences have also shown this positive correlation between PCR recombination and cycle number,15, 16 but many recombination events would have involved identical templates and thus remained undetected. In the present study, the detection of recombination events was enhanced by the use of mixtures of up to 11 different template sequences in amplification reactions. A number of other experiments carried out to study PCR recombination have involved various methods identifying only the extreme 5' and 3' ends of PCR products.10, 15, 16 Again, with multiple recombination events within a single sequence going undetected, the rate of PCR recombination would have been underestimated.
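The advantage of examining the full sequence against all templates, rather than only the 5' and 3' ends, can be illustrated with a small sketch that counts the minimum number of template switches needed to explain a clone. This is not the procedure used in the present study; it is a simple greedy scan under the assumptions that the clone and all templates are aligned without gaps, and that a base matching no template is a substitution error. All names and sequences are hypothetical.

```python
def count_crossovers(clone, templates):
    """Minimum number of template switches needed to explain `clone`.

    `templates` maps template names to sequences aligned to `clone`.
    Returns (n_crossovers, segments); each segment is a tuple
    (start, end, candidate_template_names). Each boundary is placed at the
    rightmost position the cross-over could occupy, so the true event lies
    somewhere between the last informative site before it and the boundary.
    """
    candidates = set(templates)
    segments = []
    seg_start = 0
    for pos, base in enumerate(clone):
        matching = {name for name, seq in templates.items() if seq[pos] == base}
        if not matching:
            continue  # matches no template: treated as a substitution error
        still_consistent = candidates & matching
        if still_consistent:
            candidates = still_consistent
        else:
            # No single template explains the segment any further.
            segments.append((seg_start, pos, sorted(candidates)))
            seg_start = pos
            candidates = matching
    segments.append((seg_start, len(clone), sorted(candidates)))
    return len(segments) - 1, segments


# Made-up templates for illustration only.
example_templates = {
    "A": "AAAAAAAAAA",
    "B": "AACAAAGAAA",
    "C": "AAAATAAAAT",
}
# 5' part carries the B-specific base, 3' part the C-specific base.
n_events, example_segments = count_crossovers("AACAAAAAAT", example_templates)
```

A method that inspected only the two ends of this clone would detect that it is a hybrid, but could not tell one cross-over from several, which is how multiple events within a single sequence go uncounted.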

Incomplete extension is generally used to explain the source of recombinant molecules.8, 10, 17 Shortened oligonucleotides, resulting from fragmented DNA or DNA polymerase dissociation from the template molecule, reanneal to a different template sequence in subsequent amplification cycles, creating a hybrid molecule upon further extension. Recombination is likely to increase once the plateau phase of the reaction is reached, because incomplete extension products become more successful at competing with the depleted primer. In addition, the complexity of the recombinant molecules produced in the later amplification cycles suggests that after the plateau phase is reached the Pfu DNA polymerase may degrade the DNA templates, creating random fragments that are then incorporated into a recombinant sequence. Degradation of primers in PCR amplifications using Pfu DNA polymerase14 supports this possibility. Taq DNA polymerase is unlikely to produce random degradation, although it has been shown that cleavage of DNA templates may occur at the site of stable hairpin loops.18, 19 This is consistent with the decrease in the intensity of the PCR product and with the increase in background smearing seen on the agarose gel in the final rounds of amplifications using Pfu.

It has been demonstrated that DNA polymerases, both with and without 3'–5' exonuclease activity, can switch between a newly synthesized strand and an oligonucleotide, providing there exists 7–9 b.p. of sequence complementarity between the 3' end of the newly synthesized strand and the oligonucleotide.20 Switching from the template strand to an annealing fragment downstream is shown to increase with an increase of similarity between the two, and also with an increase in temperature.21 Therefore, DNA degraded by DNA polymerase activity may interfere in the extension of new DNA strands. Also, template switching to pre-existing templates or to the complementary nascent strand has been shown to create recombinant sequences after a single round of PCR amplification,22 indicating that recombinant molecules may be produced in the absence of smaller DNA fragments.

A possible mechanism suggested for the creation of recombination artefacts is that in the later stages of amplification, heteroduplexes may form between two single strands from different templates; following cloning of the heteroduplex into a plasmid vector and transformation, the host cell acts on the DNA sequence, editing the mismatched bases at random and thus creating a hybrid molecule.16 However, recombinant sequences identified in our experiments frequently contained sequence attributable to more than two different templates. A more likely site for the repair of such mismatched bases present in a heteroduplex would be during the amplification reaction, with the DNA polymerase recognizing and repairing the mismatched bases. As the number of cycles increases, the complexity of such molecules would also increase, with previously `corrected' sequences pairing with new DNA strands and undergoing further DNA editing. This would be more likely to occur in reactions using Pfu DNA polymerase than with Taq DNA polymerase, due to the proofreading activity of Pfu.

Recombinant molecules can be complex. We found many examples containing sequences donated by five or more templates in the test mixture. However, analysis of the location of recombination events in all recombinant molecules, both single and multiple, clearly showed that with both Pfu and Taq the distribution was random. This finding was important because it meant that the probability of isolating an identical recombinant from independent PCR amplifications was low. These findings conflict with a report of PCR recombination events being clustered around three distinct sites within a 275-b.p. sequence.10 However, in that analysis a number of identical recombinant sequences were included.10 If identical recombination events were scored more than once, the actual number of recombinant sequences may have been overestimated (i.e. identical recombinant sequences from a single PCR amplification are not the result of independent recombination events, but amplified copies of a single event). Therefore, the identification of clusters of recombination events would be false.

In the light of the present results, minimization of artefactual PCR recombinant sequences may be accomplished by the following.

(1) Employing the minimal possible number of PCR amplification cycles required to generate a clonable product in order to limit the number of recombinant molecules produced by incomplete extension. This will also avoid either possible enzyme-directed degradation of the PCR product or formation of mismatched heteroduplexes, both of which occur after the maximal DNA concentration is reached (plateau phase), and increase with further cycling (over-amplification).

(2) Employing a DNA polymerase lacking 3'–5' exonuclease (proofreading) activity, to prevent degradation of the template and to minimize enzyme-generated recombinant molecules produced in later cycles.

The first of these measures is by far the most significant. Finally, because the probability of isolating identical artefacts is extremely low, it may be assumed that identical sequences isolated from independent PCR amplifications are genuine.

Polymerase chain reaction recombination will pose problems when the aim is to sample from a pool of very similar DNA sequences (rapidly evolving viral quasispecies, multigene families). Polymerase chain reaction recombination may also affect studies where different types of DNA may be co-amplified by the same primer set, such as endogenous and transgenic IgV sequences, germline V genes and somatically mutated V(D)J sequences,23, 24, 25 and variants of HIV within individual AIDS patients.26 Rational application of the variables investigated here should improve the accuracy and efficiency of any project involving PCR amplification of highly similar sequences.

References

  1. Rothenfluh HS, Steele EJ. Origin and maintenance of germ-line V genes. Immunol. Cell Biol. 1993; 71: 227–32.
  2. Rothenfluh HS, Gibbs AJ, Blanden R, Steele EJ. Analysis of patterns of DNA sequence variation in flanking and coding regions of murine germline immunoglobulin variable genes: Evolutionary implications. Proc. Natl Acad. Sci. USA 1994; 91: 12163–7.
  3. Weiller GF, Rothenfluh HS, Zylstra P et al. Recombination signature of germline immunoglobulin variable genes. Immunol. Cell Biol. 1998; 76: 179–85.
  4. Rothenfluh HS, Taylor L, Bothwell ALM, Both GW, Steele EJ. Somatic hypermutation in 5' flanking regions of heavy chain antibody variable regions. Eur. J. Immunol. 1993; 23: 2152–9.
  5. Both GW, Taylor L, Pollard JW, Steele EJ. Distribution of mutations around rearranged heavy-chain antibody variable-region genes. Mol. Cell. Biol. 1990; 10: 5187–96.
  6. Saiki RK, Gelfand DH, Stoffel S et al. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 1988; 239: 487–91.
  7. Rothenfluh HS. Somatic hypermutation and germline evolution of immunoglobulin variable genes. PhD thesis, University of Wollongong, Wollongong, New South Wales, Australia, 1994.
  8. Paabo S, Irwin DM, Wilson AC. DNA damage promotes jumping between templates during enzymatic amplification. J. Biol. Chem. 1990; 265: 4718–21.
  9. Ford JE, McHeyzer-Williams MG, Lieber MR. Chimeric molecules created by gene amplification interfere with the analysis of somatic hypermutation of murine immunoglobulin genes. Gene 1994; 142: 279–83.
  10. Meyerhans A, Vartanian J-P, Wain-Hobson S. DNA recombination during PCR. Nucleic Acids Res. 1990; 18: 1687–91.
  11. Maniatis T, Fritsch EF, Sambrook J. Molecular Cloning. Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 1982.
  12. Chomczynski P. One-hour downward alkaline capillary transfer for blotting of DNA and RNA. Anal. Biochem. 1992; 201: 134–9.
  13. Bell DA, DeMarini D. Excessive cycling converts PCR products to random-length higher molecular weight fragments. Nucleic Acids Res. 1991; 19.
  14. Stratagene Pfu DNA Polymerase Instruction Manual, Revision 056001a. 1996.
  15. Wang GC-Y, Wang Y. The frequency of chimeric molecules as a consequence of PCR co-amplification of 16s rRNA genes from different bacterial species. Microbiology 1996; 142: 1107–14.
  16. Yang YL, Wang G, Dorman K, Kaplan AH. Long polymerase chain reaction amplification of heterogeneous HIV Type 1 templates produces recombination at a relatively high frequency. AIDS Res. Hum. Retroviruses 1996; 12: 303–6.
  17. Shuldiner AR, Nirula A, Roth J. Hybrid DNA artifact from PCR of closely related target sequences. Nucleic Acids Res. 1989; 17: 4409.
  18. Cariello NF, Thilly WG, Swenberg JA, Skopek TR. Deletion mutagenesis during polymerase chain reaction: Dependence on DNA polymerase. Gene 1991; 99: 105–8.
  19. Tombline G, Bellizzi D, Sgaramella V. Heterogeneity of primer extension products in asymmetric PCR is due both to cleavage by a structure-specific exo/endonuclease activity of DNA polymerases and to premature stops. Proc. Natl Acad. Sci. USA 1996; 93: 2724–8.
  20. Guieysse A-L, Praseuth D, Grigoriev M, Harel-Bellan A, Helene C. Oligonucleotide-directed switching of DNA polymerases to a dead-end track. Biochemistry 1995; 34: 9193–9.
  21. Patel R, Lin C, Laney M, Kurn N, Rose S, Ullman EF. Formation of chimeric DNA primer extension products by template-switching onto an annealed downstream oligonucleotide. Proc. Natl Acad. Sci. USA 1996; 93: 2969–74.
  22. Odelberg SJ, Weiss RB, Akira H, White R. Template-switching during DNA synthesis by Thermus aquaticus DNA polymerase 1. Nucleic Acids Res. 1995; 23: 2049–57.
  23. Rogerson BJ. Mapping the upstream boundary of somatic mutations in rearranged immunoglobulin transgenes and endogenous genes. Mol. Immunol. 1994; 31: 83–98.
  24. Anderson MK, Shamblott MJ, Litman RT, Litman GW. Generation of immunoglobulin light chain gene diversity in Raja erinacea is not associated with somatic rearrangement, an exception to a central paradigm of B cell immunity. J. Exp. Med. 1995; 182: 109–19.
  25. Varade WS, Carnaham JA, Kingsley PD, Insel RA. Inherent properties of somatic hypermutation as revealed by human non-productive VH6 immunoglobulin rearrangements. Immunology 1998; 93: 171–6.
  26. Howell RM, Fitzgibbon JE, Noe M et al. In vivo sequence variation of the human immunodeficiency virus type 1 env gene: Evidence for recombination among variants found in a single individual. AIDS Res. Hum. Retroviruses 1991; 7: 869–76.
Acknowledgements

We thank Roy Riblet for critical discussions which helped initiate the project. This work was supported in part by the Australian Research Council.
