Chemical Biology: Past, Present and Future

Grand Challenge Commentary: The chemistry of a dynamic genome

Journal name: Nature Chemical Biology

In the postsequencing era, chemical biology is uniquely situated to investigate genomic DNA alterations arising through epigenetic modifications, genetic rearrangements or active mutation. These transformations significantly expand nature's diversity and may profoundly alter our view of DNA's coding potential.

The sequencing of entire genomes, particularly the human genome, has arguably defined our most recent scientific epoch [1, 2]. However, the static picture that has emerged fails to convey the remarkably dynamic nature of DNA. If variety is the spice of life, then the grand challenge ahead for chemical biologists is to decipher how chemical transformation of this genomic information can yield such variety. Just as the approaches of chemical biology have helped to decipher the array of post-translational modifications that expand the functional scope of proteins, similar approaches can now be used to explore the modifications of genomic DNA that promote diversity. This diversity can arise at the level of modifying or mutating individual nucleobases or at the level of recombining larger blocks of DNA. The opportunity for chemistry, then, is to help define these modifications and the mechanisms by which they occur, ultimately allowing us to explore the nature of the dynamic genome.

Modifying the genome

Variety is evident within complex organisms, where diverse cell lineages are all derived from the same genome. In establishing specialized functions for cells, selective portions of genomic information are differentially amplified or muffled, with DNA cytosine methylation known to play a key role. Although methylation has been extensively explored, it is increasingly clear that the events regulating gene expression involve transformations beyond simple methylation. The recent discovery of mammalian oxygenase enzymes that generate 5-hydroxymethylcytosine suggests that the extent of purposeful modification of the genome has been underestimated [3]. More notably, standing as a conspicuous example of our limited understanding of genomic dynamics, the mechanism by which modified cytosine residues are returned to their unmodified state, a requirement for cellular pluripotency, remains unknown. DNA glycosylases, deaminases and oxygenases have all been postulated to play roles [4, 5, 6]. These same enigmatic mechanisms that reverse cytosine modification are also likely to be at play in the oncogenic transformations that alter gene expression to reveal the darker aspects of genomic potential [7].

Chemical biology has multiple roles to play in deciphering genomic modifications (Fig. 1). Nucleobase analogs targeting methylation pathways have already demonstrated clinical utility, and epigenetic therapies offer increasing promise. Insights into the mechanism of methyltransferase enzymes have produced nucleotide inhibitors such as 5-azacytidine. These analogs have emerged as treatments for myelodysplastic syndrome, where aberrant methylation patterns are prominent [8]. Given that the functional implications of modifications beyond methylation are currently unknown, chemical biology has the potential to help elucidate and then perturb these other pathways that modify the genome. For example, chemical methods to detect site-specific modification of the genome, such as discriminating methylcytosine from hydroxymethylcytosine, can help dissect the significance of these alterations. A potentially useful but more formidable challenge is developing methods to chemically modify nucleobases in vivo to function as biological probes. Finally, given the promise of induced pluripotent cells, it is most pressing that we elucidate the enzymatic mechanisms by which cellular plasticity is restored through erasing epigenetic marks on the genome. As the demethylation mechanisms come to light, rational design of small molecules that can induce pluripotency can potentially follow.

Figure 1: The dynamic genome.

Genomic diversity can arise from the modulation of the information already present, the emergence of latent information, the rearrangement of genomic information or the active introduction of mutations. Nucleotide analogs and small molecules that target DNA-modifying enzymes offer the opportunity to decode genomic dynamics in the era after genome sequencing. dC, deoxycytidine; 5-m-dC, 5-methyldeoxycytidine; 5-hm-dC, 5-hydroxymethyldeoxycytidine; dU, deoxyuridine; dN, random natural deoxynucleotide; dT, deoxythymidine; 5-Glu-OH-dT (dJ), β-D-glucosyl-hydroxymethyldeoxyuridine.

Reorganizing the genome

Although diversity can result from alteration of cytosine nucleobases, variations can also be latently pre-encoded into the genome and revealed through genetic rearrangements (Fig. 1). The dynamic nature of the genome is most apparent at the interface of pathogens and our immune systems, where diversity is a critical battle tactic. For example, in trypanosomes, which cause African sleeping sickness, numerous variable surface glycoproteins (VSGs) are encoded within telomeric chromosomal regions, but only a single VSG is expressed at any given time [9]. A hypermodified nucleobase, β-D-glucosyl-hydroxymethyluracil (base dJ), present within the silent loci is thought to be important for VSG antigenic variation, although the mechanism is not known [10]. The Neisseria gonorrhoeae genome is similarly notable for numerous silent pseudogenes upstream of the pilin expression locus. Recombination of the coding gene with the silent copies alters the pilin sequence to mask the prominent immune epitope and alter the tissue tropism of the pathogen. Sequence-encoded guanine quartets appear essential to the nonhomologous recombination pathway that promotes antigenic variation [11]. The principles of gene duplication and recombination, evident in these pathogens, are also at work in the generation of immunologic diversity in the hosts they invade. The VDJ recombinase catalyzes the selection of variable, diversity and joining segments from among numerous repeats within the immunoglobulin locus. The recombinase mechanism has been linked to the mechanisms of other dynamic, mobile genetic elements including retroviral integrase enzymes, extending the parallel between diversity-generating mechanisms in pathogens and the immune system [12].

The expanding list of examples in which diversity results from genetic rearrangements now opens numerous avenues for exploration by chemical biology. For example, discovering trypanosome proteins that can specifically interact with synthetic dJ-containing DNA may help elucidate the pathogen's mechanism of antigenic variation and immune escape. Particularly intriguing is the question of the in vivo significance of guanine quartets in N. gonorrhoeae pilin alteration. By perturbing secondary structure elements through chemical alterations, the role of non–B-form DNA structure in evolution and genetic instability can be examined [13]. In addition to chemical manipulation of DNA, non-nucleoside chemical probes of these pathways are also likely to play an important role. HIV integrase inhibitors, which have an increasing role in antiretroviral therapy, are a notable example of the potential benefit of small molecules targeting pathways of genomic rearrangement. Further insights are likely to be gained from molecules that can alter the recombination pathways present in other pathogens, or that can even be used to examine the influence of the endogenous retroelements that heavily populate our genomes.

Rewriting the genome

Chemical modifications of the genome, however, extend beyond modulating or rearranging existing genetic information. Remarkably, the dynamic genome can also adapt through the active introduction of purposeful mutations (Fig. 1). Antibody affinity maturation is initiated by the immune mutator enzyme activation-induced deaminase (AID), which deaminates cytosine bases within the variable and switch regions of the immunoglobulin locus [14]. Deamination produces deoxyuridine, an unnatural base in DNA, which can subsequently seed diversity generation [15, 16]. Downstream of AID, many of the enzymes long reputed to be DNA 'repair' enzymes appear to serve a second function as mediators of purposeful mutation. Moving our focus from adaptive to innate defense mechanisms, the immune system also uses the active introduction of mutations to defend against invading pathogens. For example, the APOBEC3 deaminases can garble the genome of retroviral pathogens such as HIV through extensive cytosine deamination [17]. The active introduction of mutations is also adeptly used by pathogens to diversify in response to various stressors, including immune pressure, antimicrobials or metabolic challenges. Part of this response is the induction of error-prone polymerases, which can be useful to bypass mutagenic DNA lesions and serve to increase genetic diversity in the process [18].

On multiple fronts, chemical approaches to mutator pathways can help elucidate the mechanisms of immune function and pathogenesis and may provide new approaches to antimicrobial therapy. For example, enhancing immune-mediated mutagenesis could help counteract retroviral pathogens. In the attack-counterattack paradigm that governs host-pathogen interactions, HIV uses the accessory protein Vif to target the innate defense factor APOBEC3G for ubiquitination and degradation. Recently, screening has revealed intriguing prospects for small-molecule inhibitors that can disrupt the Vif-APOBEC3 protein-protein interface to potentially promote innate defense against HIV [19]. Alternatively, using chemical probes to target the evolutionary pathways that promote diversification under stress represents a new approach to address the pressing issue of multidrug resistance in pathogens [20].

The chemical recipe for diversity

Chemical biology is well situated to reveal the dynamic genome, which in turn will alter our views on a multitude of fundamental biological processes, including cellular differentiation, epigenetic reprogramming and the interactions between the immune system and pathogens. Among the notable observations made by extensive sequencing projects is the recognition that the high level of diversity—whether in different cell lineages, among members of the same species or between closely related organisms—contrasts sharply with the considerable extent of genomic sequence conservation. To explain the remarkable diversity present in nature, our view of DNA must itself be transformed, and this can be achieved in the coming years by understanding the chemical alterations in DNA that confer diversity, through modulation of the genome, rearrangement or active mutagenesis. Strategies such as inhibiting DNA methyltransferases, inducing pluripotency with small molecules, elucidating the function of hypermodified nucleobases and targeting pathogen evolution are but a few examples of the promise offered by melding nucleic acid chemistry with mechanistic studies of the pathways that modify DNA. It is most fitting that the era after genomic sequencing offers unprecedented opportunities for chemistry to decode the spice of life.


  1. Lander, E.S. et al. Nature 409, 860–921 (2001).
  2. Venter, J.C. et al. Science 291, 1304–1351 (2001).
  3. Tahiliani, M. et al. Science 324, 930–935 (2009).
  4. Bhutani, N. et al. Nature 463, 1042–1047 (2010).
  5. Ooi, S.K. & Bestor, T.H. Cell 133, 1145–1148 (2008).
  6. Ito, S. et al. Nature 466, 1129–1133 (2010).
  7. Jones, P.A. & Baylin, S.B. Cell 128, 683–692 (2007).
  8. Quintás-Cardama, A., Santos, F.P. & Garcia-Manero, G. Nat. Rev. Clin. Oncol. 7, 433–444 (2010).
  9. Barry, D. & McCulloch, R. Nature 459, 172–173 (2009).
  10. Borst, P. & Sabatini, R. Annu. Rev. Microbiol. 62, 235–251 (2008).
  11. Cahoon, L.A. & Seifert, H.S. Science 325, 764–767 (2009).
  12. Fugmann, S.D. Semin. Immunol. 22, 10–16 (2010).
  13. Zhao, J., Bacolla, A., Wang, G. & Vasquez, K.M. Cell. Mol. Life Sci. 67, 43–62 (2010).
  14. Muramatsu, M. et al. Cell 102, 553–563 (2000).
  15. Peled, J.U. et al. Annu. Rev. Immunol. 26, 481–511 (2008).
  16. Maul, R.W. & Gearhart, P.J. Adv. Immunol. 105, 159–191 (2010).
  17. Chelico, L., Pham, P., Petruska, J. & Goodman, M.F. J. Biol. Chem. 284, 27761–27765 (2009).
  18. Patel, M., Jiang, Q., Woodgate, R., Cox, M.M. & Goodman, M.F. Crit. Rev. Biochem. Mol. Biol. 45, 171–184 (2010).
  19. Nathans, R. et al. Nat. Biotechnol. 26, 1187–1192 (2008).
  20. Smith, P.A. & Romesberg, F.E. Nat. Chem. Biol. 3, 549–556 (2007).

Author information


  1. Rahul M. Kohli is in the Division of Infectious Disease in the Department of Medicine, and the Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.

Competing financial interests

The author declares no competing financial interests.