“We totally missed the possible role of ... [DNA] repair although ... I later came to realise that DNA is so precious that probably many distinct repair mechanisms would exist.” Francis Crick, writing in Nature, 26 April 1974 (ref. 1).

This retrospective reflection by Francis Crick, penned two decades after he and James Watson reported the structure of DNA, hints at the early perception of DNA as a highly stable macromolecular entity. This prevailing view at the time significantly delayed serious consideration of biochemical processes such as mutation and repair. It was once suggested by Frank Stahl that “the possibility that ... genes were ... subject to the hurly-burly of both insult and clumsy efforts to reverse the insults, was unthinkable.”2

But subsequent work on the three 'Rs' of DNA metabolism, namely replication (copying of DNA prior to each cell division), recombination (exchanges between different DNA molecules in a cell) and repair (restoration of altered DNA to its normal state), revealed the dynamic state of DNA. It became apparent that DNA in all living organisms continually incurs many types of damage, and that cells have devised ingenious mechanisms for tolerating and repairing the damage. Failure of these mechanisms can lead to serious disease consequences, as is well illustrated by the human hereditary diseases xeroderma pigmentosum (XP), hereditary non-polyposis colon cancer (HNPCC) and some forms of breast cancer. XP is characterized by about a 10,000-fold increased risk of skin cancer associated with sunlight exposure; individuals with HNPCC manifest an increased hereditary predisposition to colon (and other) cancer.

The roots of repair

The early work on DNA damage and repair in the 1930s was stimulated by a small but prominent group of physicists3. As recounted by the geneticist Guido Pontecorvo, “in the years immediately preceding World War II something quite new happened: the introduction of ideas (not techniques) from the realm of physics into the realm of genetics, particularly as applied to the problems of size, mutability, and self-replication of genes”4. Seminal to this coalition between physics and biology in pre-war Germany was the collaboration between German physicists Karl Zimmer and Max Delbrück and the Russian geneticist Nikolai Timoféeff-Ressovsky5. Their partnership was stimulated by the work of Hermann Muller, a geneticist working on the fruitfly Drosophila who first demonstrated that external agents, such as ionizing radiation, can cause mutations in living organisms6.

Timoféeff-Ressovsky and Zimmer were interested primarily in how such small amounts of energy in the form of ionizing radiation (formally equivalent to no more than the amount of energy absorbed as heat by drinking a cup of hot tea) could have such profound biological effects3. Delbrück and Muller, on the other hand, were intrigued by whether such mutations could reveal insight into the physical nature of the gene.

In retrospect, it was inevitable that the deployment of physical (and later chemical) tools, such as ionizing and ultraviolet (UV) radiation, to study genes would in due course lead to questions as to how these agents damaged DNA3. And, once it was recognized that these interactions promoted deleterious effects on the structure and function of genes, to questions concerning how cells cope with damaged DNA. Zimmer wrote, “one cannot use radiations for elucidating the normal state of affairs without considering the mechanisms of their actions, nor can one find out much about radiation induced changes without being interested in the normal state of the material under investigation.”7

Hints of the ability of living cells to recover from the lethal effect of UV radiation emerged as early as the mid-1930s8. But the discovery of a DNA-repair mechanism had to wait until the end of the 1940s, through the independent, serendipitous observations of Albert Kelner9 working in Milislav Demerec's group at the Cold Spring Harbor Laboratory, and Renato Dulbecco10 in Salvador Luria's laboratory at Indiana University. Neither Kelner nor Dulbecco set out to study damage to DNA or its repair. They were both using UV radiation as an experimental tool, but observed anomalous survival rates when cells or bacteriophage (bacteria-infecting viruses) were inadvertently exposed to long-wavelength light, either as sunlight or fluorescent light in their respective laboratories9,10. Their efforts to explain these confounding observations led to the discovery of the phenomenon now known as photoreactivation, whereby the DNA damage incurred by exposure to UV light is repaired by a light-dependent enzyme reaction11 (Fig. 1).

Figure 1: Photoreactivation reverses DNA damage.

Exposure of DNA to ultraviolet (UV) radiation results in covalent dimerization of adjacent pyrimidines, typically thymine residues (thymine dimers), illustrated here as a purple triangle. These lesions are recognized by a photoreactivating enzyme, which absorbs light at wavelengths >300 nm (such as fluorescent light or sunlight) and facilitates a series of photochemical reactions that monomerize the dimerized pyrimidines, restoring them to their native conformation.

Curiously, even with the elucidation of the structure of DNA only four years away, neither Dulbecco nor Watson — who was a graduate student in Luria's laboratory when Dulbecco stumbled on photoreactivation, and had himself examined the effects of ionizing radiation for his doctoral thesis2 — thought about DNA repair. However, shortly after Watson and Crick reported on the DNA double helical structure, they noted the implications of the base-pairing rules for mutagenesis, stating “spontaneous mutation may be due to a base occasionally occurring in one of its less likely tautomeric forms”12.

Tautomerism is the property of a compound that allows it to exist in two interconvertible chemical states; in the case of DNA bases, as either keto or enol forms. Watson and Crick had initially overlooked the complications of tautomerism and were trying unsuccessfully to construct their DNA model with the rare enol form of bases. It was only after Jerry Donohue, a former graduate student of Linus Pauling, pointed out to Watson that he should be using the more common keto form that the problem of how bases could stably pair was solved13.

But no consideration was then given to the fact that the chemical lability of DNA implicit in tautomerism might have wider implications for the stability of genes. Indeed, the field gave little thought to the precise nature of DNA damage and its possible biological consequences. One must recall, however, that even at the time the DNA double helix was unveiled, its 'pathology' and the biological consequences thereof were far less compelling problems than deciphering the genetic code or understanding the essential features of DNA replication. Even mutagenesis — put to extensive use as a tool for determining the function of genes and their polypeptide products, and for defining the genetic code — was not widely considered in mechanistic terms until much later11. This is despite the fact that the repair phenomenon of photoreactivation was known before the discovery of the structure of DNA.

A DNA duplex for redundancy

Watson and Crick noted, with famous prophetic understatement, “it has not escaped our notice that the specific [base] pairing we have postulated immediately suggests a possible copying mechanism for the genetic material”14. However, it was not intuitively obvious that a double-stranded molecule should be required for DNA replication. In principle, a single-stranded chain could just as easily do. But the significance of the duplex DNA structure soon became apparent. It was shown that DNA replicates in a semi-conservative fashion, whereby each strand of the double helix pairs with a new strand generated by replication. This enables errors introduced during DNA replication to be corrected by a mechanism known as excision repair, which relies on the redundancy inherent in having two complementary strands of the genetic code. If the nucleotides on one strand are damaged, they can be excised and the intact opposite strand used as a template to direct repair synthesis of DNA15 (Fig. 2).
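The redundancy argument can be made concrete with a deliberately simplified sketch. It treats each strand as a string, writes the two strands aligned position by position (so the antiparallel orientation of real DNA is ignored), and uses 'X' as a hypothetical marker for a damaged nucleotide. The function names are illustrative assumptions, not established terminology, and the enzymology of excision repair is not modelled.

```python
# Toy model only: real excision repair is enzymatic, not string editing.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, kept in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def excision_repair(damaged: str, template: str, start: int, end: int) -> str:
    """Excise damaged[start:end] and resynthesize it from the intact template."""
    if len(damaged) != len(template):
        raise ValueError("strands must be the same length")
    # Repair synthesis: the intact opposite strand dictates the new bases.
    patch = complement(template[start:end])
    # Ligation: the new patch is joined to the untouched flanking sequence.
    return damaged[:start] + patch + damaged[end:]


top = "ATGCCGTA"
bottom = complement(top)   # intact complementary strand
damaged_top = "ATGXXGTA"   # 'X' marks damaged nucleotides (hypothetical notation)
repaired = excision_repair(damaged_top, bottom, 3, 5)
print(repaired == top)     # True
```

A single-stranded genome would have no `template` argument to pass, which is the point of the passage: without the second strand, the information lost at the damaged positions cannot be recovered.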

Figure 2: Responses to DNA damage.

DNA damage (illustrated as a black triangle) results in either repair or tolerance. a, During damage tolerance, damaged sites are recognized by the replication machinery before they can be repaired, resulting in an arrest that can be relieved by replicative bypass (translesion DNA synthesis) (see Fig. 3). b, DNA repair involves the excision of bases and DNA synthesis (red wavy lines), which requires double-stranded DNA. Mispaired bases, usually generated by mistakes during DNA replication, are excised as single nucleotides during mismatch repair. A damaged base is excised as a single free base (base excision repair) or as an oligonucleotide fragment (nucleotide excision repair). Such fragments are generated by incisions flanking either side of the damaged base. Nucleotide excision repair can also transpire in some organisms by a distinct biochemical mechanism involving only a single incision next to a site of damage (unimodal incision). c, The cell has a network of complex signalling pathways that arrest the cell cycle and may ultimately lead to programmed cell death.

Many paths to mutation and repair

The elucidation of the DNA structure provided the essential foundation for defining the different types of mutations arising from both spontaneous and environmental DNA damage that affect all living cells12. Once again, the insights of physicists featured prominently3, including, among others, Richard Setlow, who identified thymine dimers as stable and naturally occurring DNA lesions arising in cells exposed to sunlight (UV radiation). Such lesions comprise a covalent joining of two adjacent thymine residues in the same DNA chain. They generate considerable distortion of the normal structure of DNA and seriously impede DNA transactions such as replication and transcription. The repair of these lesions could be monitored experimentally, and this prompted the discovery by Setlow16 and others2 of excision repair in bacteria and higher organisms17.

As the profusion of alterations in DNA became more widely recognized, scientists came to appreciate that the identification of any new type of naturally occurring base damage would, if one searched diligently enough, almost certainly lead to the discovery of one or more mechanisms for its repair or tolerance2,18. Such has indeed been the case. DNA repair now embraces not only the direct reversal of some types of damage (such as the enzymatic photoreactivation of thymine dimers), but also multiple distinct mechanisms for excising damaged bases, termed nucleotide excision repair (NER), base excision repair (BER) and mismatch repair (MMR)11 (Fig. 2). The principle of all three mechanisms of repair involves splicing out the damaged region and inserting new bases to fill the gap, followed by ligation of the pieces.

The process of NER is biochemically complicated, involving as many as 30 distinct proteins in human cells that function as a large complex called the nucleotide excision repairosome. This 'repair machine' facilitates the excision of damaged nucleotides by generating bimodal incisions in the flanking regions and removing a fragment about 30 nucleotides in length11 (Fig. 2). Damaged bases that are not recognized by the NER machinery are corrected by BER, whereby the bases are excised from the genome as free bases by a different set of repair enzymes. In MMR, incorrect bases incorporated as a result of mistakes during DNA replication are excised as single nucleotides by yet a third group of repair proteins (Fig. 2). Both NER and BER transpire by somewhat different mechanisms depending on whether the DNA damage is located in regions of the genome that are undergoing active gene expression (transcription-coupled repair) or are transcriptionally silent (global genome repair)11,19.

In addition to the various modes of excision repair that evolved to cope with damaged bases or mistakes during replication, cells frequently suffer breakage of one or both chains of the DNA duplex11. Naturally occurring reactive oxygen molecules and ionizing radiation are prevalent sources of such damage11. Strand breaks must be repaired in order to maintain genomic integrity. In particular, double-strand breaks (DSBs) sever the chromosomes and are lethal unless repaired11.

Several mechanisms for the repair of DSBs have been elucidated (Fig. 3). One of these involves swapping equivalent regions of DNA between homologous chromosomes, a process called recombination11. This type of exchange occurs naturally during meiosis, the special type of cell division that generates the germ cells (sperm and ova). It can also be used to repair a damaged site on a DNA strand by using information located on the undamaged homologous chromosome. This process requires an extensive region of sequence homology between the damaged and template strands. Multiple proteins are required for DSB repair by recombination and deficiencies in this repair mechanism can cause cancer. For example, mutation of the gene encoding at least one of these repair proteins (BRCA1) causes hereditary breast cancer. An alternative mechanism for the repair of DSBs, called non-homologous end joining, also requires a multi-protein complex, and essentially joins broken chromosome ends in a manner that does not depend on sequence homology and may not be error free (Fig. 3).
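The contrast between the two routes can be illustrated with a small string-based sketch. It is a cartoon under stated assumptions: the break is modelled as two fragments with some sequence lost between them, `flank` is a hypothetical parameter giving the length of matching sequence used to locate the break on the homologous molecule, and the function names are invented for illustration.

```python
def end_joining(left: str, right: str, resected: int = 0) -> str:
    """Join two broken ends directly, without consulting any template.

    `resected` bases are trimmed from each end first (a toy stand-in for
    end processing), so the result may differ from the original sequence.
    """
    if resected:
        left, right = left[:-resected], right[resected:]
    return left + right


def homologous_repair(left: str, right: str, homolog: str, flank: int = 4) -> str:
    """Fill the gap between two broken ends by copying from a homologous molecule."""
    i = homolog.find(left[-flank:])              # locate the left end on the homolog
    j = homolog.find(right[:flank], i + flank)   # locate the right end downstream of it
    if i < 0 or j < 0:
        raise ValueError("insufficient homology to template the repair")
    return left + homolog[i + flank:j] + right


intact = "AAGCTTGGCCATCGATTT"
left, right = "AAGCTT", "ATCGATTT"   # the break has lost the bases "GGCC"

print(homologous_repair(left, right, intact) == intact)  # True
print(end_joining(left, right) == intact)                # False
```

In this sketch the homology-dependent route restores the original sequence because the missing bases are copied from the intact molecule, whereas direct joining reseals the break but leaves the lost bases missing, which mirrors the statement that end joining may not be error free.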

Figure 3: The repair of double-strand breaks in DNA.

Double-strand breaks can result from exposure to ionizing radiation, oxidative damage and the spontaneous cleavage of the sugar-phosphate backbone of the DNA molecule. Their repair can be effected by either rejoining the broken ends (left) or by homologous recombination with a sister DNA molecule (right). Both processes involve different multi-protein complexes.

Damage tolerance

Although insights into DNA repair have progressed at an impressive pace, especially in the past decade, an understanding of the mechanisms of mutagenesis — a phenomenon that, as mentioned earlier, was demonstrated experimentally before discovery of the structure of DNA — has lagged. A breakthrough came from the experimental demonstration that some mutations arise as a consequence of a cell's efforts to tolerate damage. In this situation, the base damage and/or strand breaks in DNA persist in the genome, but their potential for interfering with DNA replication and transcription is somehow mitigated.

One such damage-tolerance mechanism, called translesion DNA synthesis, involves the replication machinery bypassing sites of base damage, allowing normal DNA replication and gene expression to proceed downstream of the (unrepaired) damage20 (Fig. 4). It involves specialized low-fidelity ('sloppy') DNA polymerases that are able to bypass DNA lesions that typically stall the high-fidelity polymerases required for DNA replication. To overcome the block, these 'sloppy copiers' add nucleotides to the replicating strand opposite the DNA lesion, thus allowing replication to continue, but nevertheless introducing mutations into the newly synthesized sequence20.

Figure 4: 'Sloppy copiers' overcome blocks in replication caused by a DNA lesion (a process called translesion synthesis).

a, The DNA replicative machinery (blue) stalls immediately behind a site of base damage (black triangle). Two specialized 'sloppy copier' polymerases (polη and polι) bind to the arrested replication complex. b, This interaction promotes a conformational change in the arrested replication machinery, placing polη in direct proximity to the site of base damage where it synthesizes across the lesion. c, Polη may then dissociate and allow polι to complete the process of replicative bypass by incorporating several more nucleotides (red crosses). Once the lesion has been completely bypassed, the replication machinery resumes DNA replication. As a result of this process, mutations to the DNA sequence are now incorporated into one strand.

Cell suicide

Recent years have witnessed the recognition that biological responsiveness to genetic insult embraces more than the repair and tolerance of DNA damage. The exposure of cells to many DNA-damaging agents results in the transcriptional upregulation of a large number of genes, the precise function(s) of many of which remains to be established. Additionally, cells have evolved complex signalling pathways to arrest the progression of the cell cycle in the presence of DNA damage, thereby providing increased time for repair and tolerance mechanisms to operate21 (Fig. 2c). Finally, when the burden of genomic insult is simply too large to be effectively met by the various responses discussed, cells are able to initiate programmed cell death (apoptosis), thereby eliminating themselves from a population that otherwise might suffer serious pathological consequences22.

DNA damage and cancer

The 'somatic mutation hypothesis' of cancer embraces the notion that neoplastic transformation arises from mutations that alter the function of specific genes (now called oncogenes and tumour-suppressor genes) that are critical for cell division. This theory has its roots in correlations between chromosomal abnormalities and cancer first observed by the developmental biologist Theodore Boveri23, who at the beginning of the twentieth century reported abnormal numbers of chromosomes (aneuploidy) in cancerous somatic cells.

The discovery of the structure of DNA advanced our understanding of tumorigenesis at several levels. Watson and Crick predicted from their DNA model that complementary base pairing had implications for recombination (the exchange of genetic material between chromosome pairs): “the pairing between homologous chromosomes at meiosis may depend on pairing between specific bases”. The genetic basis of many cancers is now known to arise from abnormal recombination events, such as chromosomal translocations, where a region of one chromosome is juxtaposed to another chromosome. Watson himself developed an early and ardent interest in cancer biology when he recognized that the experimentally tractable genomes of oncogenic viruses could provide important insights into the pathogenesis of cancer. Mutagenesis is now documented as a fundamental cornerstone of the molecular basis of all forms of cancer24.

Arguably the most definitive validation of the somatic mutation hypothesis derives from the discovery that defective responses to DNA damage and the accumulation of mutations underlie two distinct types of hereditary cancer: skin cancer associated with defective NER and colorectal cancer associated with defective MMR11. In both instances, credit belongs to scholars of DNA repair.

In the late 1960s, James Cleaver providentially noted an article in the San Francisco Chronicle that reported the extreme proneness to skin cancer in individuals with XP, a rare sun-sensitive hereditary disease2. Cleaver was then searching for mammalian cell lines that were defective in excision repair, and his intuitive notion that XP individuals might be sunlight-sensitive and prone to cancer because they were genetically defective in excision repair proved to be correct25.

The subsequent elucidation of the genes defective in XP patients26, and their role in NER of damaged bases in human cells11,27,28, represents a triumph of modern genetics and its application to molecular biology. The additional discovery that the process of NER in eukaryotes requires elements of the basic transcription apparatus11 has yielded insights into the complex relationships between deficient DNA repair, defective transcription and hereditary human diseases11.

A fascinating denouement to the skin-cancer predisposition in XP patients derives from the recent solution of the 'XP variant problem'. A significant fraction of XP individuals who are clinically indistinguishable from those defective in NER were found to be proficient in this repair process11. It was shown that DNA polymerase-η (polη), one of the specialized DNA polymerases capable of overcoming replication blocks at DNA lesions, is mutated in all XP-variant patients so far examined29. Not only does polη replicate past thymine dimers in DNA, but — unlike the other specialized DNA polymerases — it also correctly inserts adenine residues29, thereby preventing mutations at sites of thymine dimers. Therefore, even in XP patients with functional NER, in the absence of polη one or more other bypass polymerases attempts to cope with arrested replication at thymine dimers, but does so inaccurately29. Thus, cancer predisposition in XP essentially derives from an excessive mutational burden in skin cells associated with exposure to sunlight. These mutations accumulate either because thymine dimers are not excised (owing to defective NER) or because in the absence of polη, dimers are inaccurately bypassed by other DNA polymerases29.

The association between HNPCC and defective MMR was determined more or less simultaneously by a number of investigators. Paul Modrich2 surmised that the instability of repeated sequences in DNA associated with defective MMR in bacteria30 might be causally related to the DNA sequence instability observed in patients with HNPCC31. This led to the formal demonstration of defective MMR in this human hereditary disease and formed another persuasive validation of the somatic mutation theory of cancer32,33,34.

A look to the future

The study of biological responsiveness to DNA damage embraces DNA repair, mutagenesis, damage tolerance, cell-cycle checkpoint control, programmed cell death, and other cellular responses to genomic insult. This integrated field is now deciphering the complex regulatory pathways transduced by signalling mechanisms that detect DNA damage and/or arrested DNA replication. As these pathways become better understood, parallel technological gains in gene therapy and therapeutic intervention by rational drug design will offer new strategies for blocking the unwanted consequences of DNA damage, especially cancer.

We must remember, however, that while evolution could not have transpired without robust cellular mechanisms to ameliorate the most serious consequences of spontaneous and environmental DNA damage, the process of evolution mandates that the genetic diversification on which Darwinian selection operates be maintained constantly. Thus, life is necessarily a delicate balance between genomic stability and instability, and between mutation and repair.