Main

Taq polymerase is a celebrated hero. In the polymerase chain reaction (PCR) to amplify DNA, Taq is a widely used enzyme that assembles the complementary nucleotides along a single-stranded DNA template. Taq was isolated from a bacterium native to hot springs, which made it the enzyme of choice once PCR was invented at Cetus Corporation. The polymerase handles heat, and therefore the PCR's thermocycling, well. Science named Taq polymerase the 1989 molecule of the year.

Fidelity and infidelity are part of the PCR world. Credit: Gianluca Rasile/iStock

Taq is a frequently pummeled hero. Because it is a low-fidelity DNA polymerase, the DNA it amplifies is likely to include errors. Adenine does not always find its partner thymine, and cytosine is not always set up with guanine. Unlike many other DNA polymerases, Taq does not 'proofread' its work, a process in which a polymerase excises erroneous nucleotides and replaces them with the right ones.

Generally speaking, Taq leads to around 1 mutation in 100,000 base pairs, says bio-engineer Masood Hadi, who previously worked at a National Aeronautics and Space Administration lab and at Lawrence Berkeley National Laboratory, and is now at a synthetic biology company where he, among other tasks, runs DNA production.

Companies sell Taq and various versions of Taq that have been engineered to enhance Taq's fidelity. In addition, vendors offer 'high-fidelity' enzymes, which they describe as being a certain number of times better than Taq. 'High-fidelity' can mean 1 mutation in 1,000,000 base pairs, and academic labs have reported higher fidelity still.

These numbers don't tell all about these DNA polymerases. Polymerases have distinct qualities that shape their behavior during DNA amplification. Some polymerases such as Taq are fine for routine PCR tasks but fail with amplicons around 1.5 kilobases (kb) to 2 kb in length and larger. Only some are right for amplifying bisulfite-treated DNA to find sites of DNA methylation; some are suited for cloning, whereas others work better for finding rare mutations or for preparing sequencing libraries, and still others work for random mutagenesis experiments, or research projects with damaged DNA or DNA extracted from fossils or from blood. Scientists must match the enzyme and its properties and fidelity requirements to their experiment.

The polymerase can be at fault when PCR fails, as can other factors, and chasing them all down takes too much time. After all, says Hadi, researchers want to, for example, clone the gene, express it and move forward to “where you can do some exciting science.” Not considering DNA polymerase fidelity, however, can also be a time-sink.

For research projects at the Joint BioEnergy Institute, which is part of the US Department of Energy, Hadi and colleagues were automating high-throughput cloning. But PCR with, among others, microbial, yeast and rice DNA gave results that were “all over the place,” ranging from good to mediocre to failed amplification1. “It was very frustrating to have high-throughput methods that weren't generalizable across different species,” he says. Such experiences led the team to perform error-rate comparisons for six commonly used DNA polymerases: their fidelity was 4–50 times better than Taq's.

Model with caution

The goal with high-fidelity PCR enzymes is to introduce very few, if any, spurious PCR errors into amplified DNA, particularly when using PCR to amplify DNA to detect biologically relevant mutations, and especially with rare mutations linked to disease, says Myron Goodman of the University of Southern California, who studies mutagenesis and fidelity of DNA synthesis2,3. Deep sequencing and computational methods help labs to identify and eliminate PCR error, but it saves time to avoid such errors when possible.

To measure fidelity, researchers can use differing radioactive markers on the 'right' and the 'wrong' nucleotides. Given that high-fidelity enzymes make few errors, this way of measuring fidelity challenges experimenters. Researchers can sequence the outcome of the PCR reaction, but they might also want to model and test how nucleotides vie for positions on the emerging strand. One measurement approach is to place a radioactive label on the primer DNA strand and run a polyacrylamide gel, which, says Goodman, makes the 'right' and 'wrong' incorporated primers “a piece of cake to identify.” But that's neither an easy nor a routine experiment to do.

Chemist Alan Fersht of the University of Cambridge developed a thought experiment to deduce fidelity by modeling the incorporation kinetics for 'wrong' and 'right' nucleotides with the help of separate reactions. This approaches fidelity indirectly, says Goodman, but it remains a model for what is happening as the 'wrong' and 'right' nucleotides compete for slots on the emerging DNA strand. Although Goodman's team has confirmed this model for DNA polymerase activity in challenging, direct competition experiments to explore “who wins and who loses,” the issue he sees is that the model of the way nucleotides compete for slots on the DNA strand cannot always reliably predict outcome.

Scientists might be seduced to avoid the difficult direct competition experiment and assume that separate kinetic measurements will predict fidelity just as well as the direct competition result. But as Goodman explains, even though the model of direct competition is both good and useful, it is a flaw to believe that it always holds true experimentally. There are many DNA polymerases, and as researchers develop new ones, they need to bear in mind that kinetic measurements made with individual enzymes do not tell all of what will happen when nucleotides compete directly, he says. Doing many direct competition experiments is also not a guaranteed fidelity measurement. This, he says, is a cautionary tale for the lab.
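
The separate-reaction approach described above can be sketched numerically. The snippet below is a minimal illustration, assuming the standard competing-substrate relation from steady-state kinetics, in which the ratio of 'wrong' to 'right' insertions is the ratio of their catalytic efficiencies (kcat/Km) weighted by nucleotide concentration. The class name, function name and all kinetic values are hypothetical and are not taken from any measurement in this article. As Goodman cautions, a prediction of this kind may not match what a direct competition experiment shows.

```python
from dataclasses import dataclass


@dataclass
class Kinetics:
    """Steady-state parameters for inserting one nucleotide (hypothetical values)."""
    kcat: float  # turnover number, per second
    km: float    # Michaelis constant, micromolar

    @property
    def efficiency(self) -> float:
        # Catalytic efficiency (specificity constant), kcat / Km
        return self.kcat / self.km


def misinsertion_frequency(right: Kinetics, wrong: Kinetics,
                           conc_right: float = 1.0,
                           conc_wrong: float = 1.0) -> float:
    """Predicted ratio of wrong to right insertions when both nucleotides compete."""
    return (wrong.efficiency * conc_wrong) / (right.efficiency * conc_right)


# Illustrative numbers only: each set would come from its own separate reaction.
right = Kinetics(kcat=10.0, km=2.0)
wrong = Kinetics(kcat=0.05, km=500.0)

f_ins = misinsertion_frequency(right, wrong)
print(f"predicted misinsertion frequency: {f_ins:.1e}")
print(f"predicted fidelity (right:wrong): {1 / f_ins:.0f}")
```

Because the concentrations enter as a simple ratio, the same function also shows how a biased nucleotide pool would shift the predicted outcome under this model.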

Vendor perspective

“All polymerases are not created equal,” says Nicole Nichols, who leads a research group developing amplification reagents including polymerases at New England BioLabs (NEB) (see Box 1, 'Handy history'). “Understanding the limits of the enzyme you have can be very important and very relevant to your application,” she says. NEB has an online polymerase selection chart covering its 25 different proprietary DNA polymerases, including several versions of Taq. Taq can be fine for routine tasks, but it has difficulty amplifying DNA regions with high GC content, which other enzymes handle better. Engineered Taqs can polymerize through slightly longer genomic targets and handle higher GC content than Taq, and they work in experiments in which fidelity is not absolutely critical, she says. A version of Taq works on bisulfite-treated DNA.

NEB's higher-fidelity polymerases are better for amplifying GC-rich genomic regions, for use with primer extensions at elevated temperatures and for preparing sequencing libraries, says Nichols. In addition to polymerase choice, researchers will want to consider buffers, such as those geared toward GC-rich DNA or longer DNA templates.

For applications with very high fidelity requirements, Nichols recommends NEB's Q5 polymerase, which has around 100–200 times the fidelity of Taq. The measurements were done in-house, she says, and have been replicated by customers and external collaborators. “The enzyme pushed the assay to its limits,” says Tom Evans, who is NEB's scientific director of DNA enzymes.

The fraction of clones with errors increases linearly with target size, says Holly Hogrefe. Credit: Agilent
Polymerases are engineered to have different qualities, say Nicole Nichols and Tom Evans. Credit: Lisa Maduzia, New England BioLabs

In some experiments, fidelity might not be as central a concern as, for example, detection sensitivity, says Holly Hogrefe, an enzymologist who develops library preparation reagents and techniques for high-throughput sequencing and target enrichment in Agilent's genomics research and development division. For cloning experiments, DNA fragments should be amplified with a high-fidelity proofreading enzyme to minimize the percentage of clones with errors and the number of clones one needs to sequence. Mutation frequency—the fraction of clones with errors—increases linearly with target size.

When replication fidelity is a priority, the influences to consider are, in their order of importance, in Hogrefe's view: the intrinsic error rate of the PCR enzyme in its recommended buffer; the number of target doublings; changes in the buffer's pH or magnesium concentration or increases in nucleotide concentration, which can increase polymerase error rates by up to twofold; and challenging genomic sequences such as homopolymers or repeats, which can lead to insertion and deletion errors.
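
The first two influences on this list, together with target size, can be combined into a rough estimate of how many clones a lab might need to sequence. The sketch below assumes a commonly used linear approximation, in which mutation frequency is the error rate (per base pair per doubling) multiplied by target length and by the number of doublings. Some treatments include an extra factor of one-half, so the numbers are order-of-magnitude guides at best. The function names and the choice of 20 doublings are assumptions. The two error rates are the approximate figures quoted earlier for Taq and for a 'high-fidelity' enzyme.

```python
import math


def mutation_frequency(error_rate: float, target_bp: int, doublings: float) -> float:
    """Approximate fraction of clones carrying at least one PCR error.

    Linear approximation: error rate (errors per bp per doubling)
    x target length x number of doublings, capped at 1.0.
    """
    return min(1.0, error_rate * target_bp * doublings)


def clones_to_sequence(mut_freq: float, confidence: float = 0.95) -> int:
    """Clones needed so that at least one is error-free with the given probability."""
    if mut_freq <= 0.0:
        return 1
    if mut_freq >= 1.0:
        raise ValueError("every clone is expected to carry an error")
    # P(all n clones carry errors) = mut_freq ** n, which must be <= 1 - confidence
    return max(1, math.ceil(math.log(1.0 - confidence) / math.log(mut_freq)))


# Error rates follow the approximate figures in the text; 20 doublings is an assumption.
for name, rate in [("Taq-like", 1e-5), ("high-fidelity", 1e-6)]:
    for size in (1000, 2000, 4000):
        mf = mutation_frequency(rate, size, doublings=20)
        print(f"{name:14s} {size:5d} bp  mf={mf:.2f}  clones={clones_to_sequence(mf)}")
```

Under these assumptions, doubling the target size doubles the estimated mutation frequency, which is the linear relationship Hogrefe describes.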

When PCR fails, says Hogrefe, she and her team will probe whether a researcher is using a PCR enzyme that's appropriate for the target size and application. If primer design and template quality can be excluded as causing PCR trouble, she will cascade through options: if Taq is failing, a lab will want to try a different kind of enzyme such as a mixture with fidelity slightly higher than Taq's or a fusion with proofreading capacity. Should the yield be too low with a fusion, researchers might want to go back to a blend, instead of another proofreading fusion, which will help them avoid sensitivity issues. To avoid nonspecific bands, a scientist might want to consider a hot-start PCR enzyme or optimize the cycling conditions.

Pfu was an early DNA polymerase for PCR, and unlike Taq, Pfu has proofreading capability. Agilent sells several high-fidelity Pfu-based enzymes and has been improving fidelity and robustness, says Hogrefe. “What most people don't realize is that Taq is fairly accurate considering it lacks proofreading activity,” she says. “When you knock out Pfu's proofreading activity, its error rate is significantly higher than that of Taq.”

Agilent has specialty polymerases such as one best used for high-fidelity TA cloning, a restriction-enzyme-free cloning method, and polymerases resistant to blood or uracil. They also offer Mutazyme II, a blend of error-prone polymerases used to create a uniform spectrum of mutations such as in random mutagenesis experiments. The company tests enzymes in a variety of ways: to document robustness, scientists test the enzymes on multiple targets of varying length, GC content and copy number.

With fusion polymerases such as iProof, Sso7d, which is a double-stranded DNA-binding protein, is fused to the polymerase. Credit: Bio-Rad
There are ways to reduce the likelihood that a polymerase will disassociate from a DNA template, says Yann Jouvenot. Credit: Bio-Rad

A polymerase can disassociate from a DNA template for a variety of reasons, says Yann Jouvenot, who is part of the Bio-Rad team that develops and markets gene expression reagents such as polymerases. One issue, he says, is that a polymerase might encounter secondary structures such as stem-loops. To increase their Pyrococcus-derived polymerase's affinity for DNA so it can amplify longer DNA stretches without disassociating, Bio-Rad researchers developed a high-fidelity fusion polymerase called iProof, which has proofreading capability and fidelity around 52 times that of Taq.

In iProof, a double-stranded DNA-binding protein from Sulfolobus solfataricus called Sso7d is fused to the polymerase. This high-fidelity polymerase is used by scientists to amplify genes for cloning into vectors as well as for preparing high-throughput sequencing libraries, which they can use with Illumina-provided adaptors, he says.

“1.5 to 2 kb starts getting a little long for regular Taqs,” says Jouvenot, which is when it is important to consider switching to a different enzyme. And although fidelity is sometimes expressed as the number of mutations per base pair, “it's tough to compare in absolute numbers,” he says. With a 2-kb amplicon, a polymerase error rate of 10⁻⁵ mutations per base pair shouldn't, in theory, affect an experiment, unless a scientist is unlucky. Given that an experimenter will do several cycles of amplification, “you might be unlucky,” he says; it all depends on the experimental context.
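
Jouvenot's point about luck and cycle number can be made concrete with a back-of-envelope calculation. The sketch below treats errors as rare, independent events (a Poisson model) and interprets the quoted error rate as applying per base pair per doubling. Both are simplifying assumptions, as are the function name and the cycle counts chosen. It also assumes every cycle doubles the product, which real reactions do not achieve in later cycles.

```python
import math


def prob_at_least_one_error(error_rate: float, amplicon_bp: int, doublings: int) -> float:
    """Poisson estimate that a given product molecule carries one or more errors."""
    expected_errors = error_rate * amplicon_bp * doublings
    return 1.0 - math.exp(-expected_errors)


ERROR_RATE = 1e-5    # mutations per base pair, read here as per doubling (assumption)
AMPLICON_BP = 2000   # the 2-kb amplicon from the text

for doublings in (1, 10, 20, 30):  # illustrative cycle counts, not from the article
    p = prob_at_least_one_error(ERROR_RATE, AMPLICON_BP, doublings)
    print(f"{doublings:2d} doublings: {p:.1%} of molecules expected to carry an error")
```

In this model a single doubling leaves almost all molecules error-free, whereas the expected fraction of molecules carrying an error grows steadily as doublings accumulate, which is why the experimental context matters.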

PCR-based DNA amplification errors can be avoided with experimental tweaks such as varying the thermal profile, says Ed Smith, a researcher at Roche Molecular Systems who develops PCR chemistries and PCR technology. Buffer formulations matter, too, and there is an important balance to strike: higher divalent metal concentrations may promote robustness, but at the same time they are detrimental to enzyme fidelity, he says.

“Conditions that promote higher specificity can also promote higher fidelity,” says Smith. A hot-start polymerase can help to promote specificity, given that many mis-extension events, such as the creation of primer dimer, are triggered at lower reaction set-up temperatures, he says. Roche sells the Expand High Fidelity PCR system, which is an enzyme blend tested for amplicons up to 5 kb in length. The blend includes Taq and a proofreading enzyme that will remove mis-incorporated bases so that the correct base can be inserted and the amplification extension can continue more efficiently, he says. Mis-incorporation events can stall DNA polymerase extension and limit the efficient generation of longer PCR products.

Naturally occurring DNA polymerases are said to have a protein structure that looks like a hand, with domains similar to fingers, a palm and a thumb. Company scientists modify this 'hand' to address changing lab needs. Such changes, says NEB's Evans, make these polymerases quite unlike naturally occurring polymerases. “There's a hand but it doesn't look like any hand quite out there,” he says.

With sequencing, which is a growing PCR application, labs are asking the polymerase to amplify thousands or even millions of amplicons. “That's a very different problem than amplifying one single amplicon with one set of primers; this is millions of amplicons with one set of primers,” says Evans. The NEB researchers have tweaked Q5 to make it amenable to that level of genomic coverage, he says.

Another polymerase growth area comes from labs amplifying DNA extracted directly from blood. “Heme is a known inhibitor of polymerases,” says Evans, and Taq is particularly sensitive to heme, which led the company to engineer DNA polymerases suited for that application.

Academic perspective

The error rate of a proofreading-proficient, high-fidelity, endogenous polymerase in a eukaryotic cell is in the range of one error in a million nucleotides, says Tom Kunkel, a researcher at the National Institutes of Health who focuses on DNA replication fidelity4,5. “People have very different views as to what constitutes high fidelity,” he says. “And it's very much context-dependent.”

Some polymerases are faithful in that they continue copying DNA despite a lesion, such as when two pyrimidines are stuck to one another in the DNA backbone, says Kunkel. One example is DNA polymerase eta (Pol η), which, he says, is “one of the least accurate DNA polymerases.” It might match up nucleotides accurately only 90% of the time, which makes for fault-tolerant DNA replication. That can help explain how favorable mutations arise during development of the immune system, how cells survive all sorts of DNA-damaging insults, and how tumor cells survive DNA damage from chemotherapy.

For experiments geared toward genetic analysis, Kunkel and his team rarely use PCR, and when they do, they sequence the amplified DNA, such as the entire open reading frame of their gene of interest. “We do not make interpretations based on the accuracy of a PCR digestion product unless we very, very carefully consider signal-to-noise ratios,” he says.

High-throughput sequencing is a way to distinguish true mutation from artifacts, says bio-engineer Hadi, but statistical analysis of sequencing results can get challenging if the starting template is first amplified.

Researchers have an array of tricks to influence fidelity when working with DNA polymerases. With fossil DNA or damaged DNA, nicks or breaks make it hard for the DNA polymerase to assemble nucleotides. One trick Hadi uses to reduce the number of breaks prior to amplification is to first add DNA repair enzymes to a sample and incubate the DNA. And when a polymerase stalls when it reaches a GC-rich region, he sometimes will add an error-prone polymerase on purpose, which can kick-start the reaction. This enzyme will “die off” after a few thermocycles, after which he adds a high-fidelity enzyme.

Indirect factors to take into account with PCR-based DNA amplification include the fact that thermocycling deaminates cytosine, either cytosine in the pool of nucleotides added to a PCR reaction tube or a cytosine on the DNA backbone. Deaminated cytosine on the DNA backbone can appear to the polymerase as uracil, says Kunkel, which means there is a risk that what was a C-G pairing will become a T-A pairing. The deamination risk increases as the number of heating cycles rises.

Over the course of a PCR reaction, the nucleotide pool can become biased, with one type of nucleotide dominating the mix, says Hadi. Researchers rely on the spectrophotometer's accuracy to calculate concentrations but, he says, reagents are not always what they're supposed to be.

For microbial sequencing experiments, Hadi is concerned about the way polymerases are produced in yeast or bacteria. Even after purification, DNA from that organism might be retained with the polymerase, which is then amplified, especially with extended PCR cycles, he says. This stowaway DNA risks skewing experiments in which labs are comparing microbiomes from different environments.

Kunkel's lab mainly studies replication fidelity in budding yeast, in which the major replication enzymes are conserved with those of mice and humans. A view of the polymerase's crystal structure helps his lab study features that might affect fidelity. The team can increase or decrease fidelity by making strategic amino acid residue replacements, and can then move from in vitro experiments to work in yeast, human cells and mice to study the effects of the fidelity shift.

Kunkel says that he made some scientists and vendors uncomfortable when he first began publishing about the factors that lead to varying degrees of polymerase infidelity during DNA replication. That view has shifted, but it's still not exactly a pleasant truth that polymerases can be less faithful than labs think they are.

Scientists can calculate, based on first principles, the probability of events occurring in each cycle that can lead to errors in the PCR-amplified DNA, says Kunkel. “For the vast majority of uses of the PCR products, it's not particularly relevant,” he says. But in labs setting out to use individual molecules of the PCR reaction for further downstream work, “it's something to keep in mind.”