Main

All cells are coated with complex carbohydrates called glycans, which form a layer known as the glycocalyx, ranging from 10 to 100 nm in thickness1,2. Glycans are present in many different molecular forms, including glycoproteins, proteoglycans, glycolipids and glycophosphatidylinositol-linked proteins. Their broad diversity originates from their assembly from monosaccharide building blocks, which can be linked to each other at various positions on their pyranose or furanose rings. Each ring can establish several linkages, giving rise to branched structures. Finally, the structural complexity of glycans is further increased by the possibility of α- and β-isomers at the anomeric centre.

This dense structural information is decoded by carbohydrate-binding proteins, which are involved in important physiological and pathophysiological events. The need for an integrated approach to decipher the structure–activity relationships (SARs) between glycans and their protein receptors has led to the establishment of interdisciplinary collaborative efforts in the United States (Consortium for Functional Glycomics; see Further information), Europe (EuroCarb; see Further information) and Japan (Human Disease Glycomics/Proteome Initiative; see Further information).

Currently, over 80 carbohydrate-binding proteins have been identified. The binding specificities for many of them have been elucidated, and others are being screened on large glycoarrays to determine their glycan-binding epitopes. These discoveries have led to a renaissance in glycobiology. They also provide a continuous supply of carbohydrate-related targets for the structure-based design of new chemical entities that mimic bioactive carbohydrates, and form a novel class of therapeutics.

Carbohydrate and carbohydrate-derived drugs

Although carbohydrates play an important part in a vast array of biological processes, carbohydrate and carbohydrate-derived drugs cover only a limited area of the world of therapeutics (Fig. 1). Many pathophysiologically important carbohydrate–protein interactions have yet to be exploited as a source of new drug targets. One reason might be the pharmacokinetic drawbacks that are inherently linked to carbohydrates. As a result of their high polarity, they are unable to cross passively through the enterocyte layer in the small intestine — a prerequisite for oral availability. In addition, once systemically available by parenteral administration, carbohydrates suffer from fast renal excretion.

Figure 1: Carbohydrate and carbohydrate-derived drugs.

Structures of currently approved drugs (trade name in brackets). These include glycosidase inhibitors that prevent the digestion of carbohydrates for the treatment of diabetes (voglibose4, miglitol5 and acarbose6) and the prevention of influenza virus infections (zanamivir7 and oseltamivir9); and sulphated glycosaminoglycans, which function as anticoagulants by binding to antithrombin III for the treatment of thrombosis (fondaparinux3, dalteparin161, ardeparin161, nadroparin161 and enoxaparin161). In addition, carbohydrate-derived drugs are used to treat Gaucher's disease (miglustat162), epilepsy (topiramate163) and osteoarthritis (sodium hyaluronate164).

When interactions with blood plasma components are possible, the plasma half-life that is required for a successful therapeutic application can be achieved. Prominent examples are the low-molecular-weight heparins, derived from animal tissue, and fondaparinux3 (Arixtra; GlaxoSmithKline), which are used as anticoagulants. In other cases — such as the inhibition of α-glycosidases in the brush border of the small intestine for the treatment of diabetes (by voglibose4 (Basen/Glustat/Volix; Takeda), miglitol5 (Glyset; Pfizer) and acarbose6 (Glucobay/Prandase/Precose; Bayer)) or the inhibition of viral neuraminidases in the pharyngeal mucosa (by zanamivir7 (Relenza; GlaxoSmithKline)) — oral availability is not required.

The paradigm of a glycomimetic drug in the classical sense is oseltamivir (Tamiflu; Gilead/Roche). Starting from a carbohydrate lead, drug likeness was achieved by systematically eliminating polar groups and metabolic 'soft spots'8 that were not required for affinity. Finally, by designing a prodrug, oral availability became possible9.

Glycodrugs in preclinical and clinical evaluation

Carbohydrate-binding proteins are broadly classified into lectins10 and sulphated glycosaminoglycan (SGAG)-binding proteins11,12. There are two categories of lectins present in vertebrates: the families of intracellular lectins (for example, calnexin, L-type and P-type lectins), which bind core oligosaccharide structures and are involved in glycoprotein processing and quality control, and the families of extracellular lectins (for example, galectins, C-type, I-type and R-type lectins), which recognize terminal carbohydrate epitopes of other cells and pathogens. Extracellular lectins account for most of the molecular targets that are being investigated in current drug discovery programmes.

By contrast, SGAG-binding proteins are heterogeneous and difficult to classify11,12. Their ability to recognize SGAGs arises from clusters of cationic amino acids on unrelated proteins that confer the ability to recognize anionic structural motifs in extended SGAG chains. Typically, various SGAG-binding proteins interact with each SGAG with different affinities, and only a few SGAG sequences are exclusively recognized by a single SGAG-binding protein.

Here, we present the most promising drug candidates from the lectin families: selectins and dendritic cell-specific ICAM3-grabbing non-integrin 1 (DC-SIGN; also known as CD209) from the C-type lectin family, myelin-associated glycoprotein (MAG; also known as sialic acid-binding immunoglobulin-like lectin 4A (Siglec 4A)) as an example of an I-type lectin, and PA-I galactophilic lectin (PA-IL), fucose-binding lectin PA-IIL and minor component of type 1 fimbriae (FimH) as representatives of bacterial lectins.

C-type lectins

The hallmark of C-type lectins is the involvement of Ca2+ in the binding of glycans to their carbohydrate recognition domain (CRD). They have a wide range of biological functions, such as intercellular adhesion, serum glycoprotein removal and pathogen recognition.

Selectins. These are perhaps the most intensely studied mammalian carbohydrate-binding proteins. First discovered in 1989 (Refs 13–15), their functions as adhesion molecules are well understood16. The family consists of three members: E-selectin (also known as CD62E), P-selectin (also known as CD62P) and L-selectin (also known as CD62L). They are composed of a Ca2+-dependent CRD, an epidermal growth factor (EGF) domain, various short complement-like consensus repeats, a single transmembrane domain and an intracellular tail. Although carbohydrates bind to a receptor site within the CRD, the neighbouring EGF domain influences binding affinity and specificity17.

The three selectins have overlapping and distinct expression patterns, both temporally and spatially. E-selectin is expressed on endothelial cells by de novo protein synthesis 2–4 hours after stimulation by inflammatory mediators, such as interleukin 1β and tumour necrosis factor-α. P-selectin is expressed on activated platelets and is also stored in Weibel–Palade bodies in endothelial cells, which fuse to the cell surface on activation, leading to the expression of P-selectin within minutes. L-selectin is constitutively expressed by most leukocytes and plays a major part in homing and trafficking of lymphocytes through the blood and lymphatic systems.

All three selectins bind a common carbohydrate domain shared by sialyl Lea/x (sialyl Lewisa (sLea) and sialyl Lewisx (sLex))18. Interestingly, both of these carbohydrate sequences were originally discovered as cancer-associated antigens19,20,21 and are prognostic indicators of metastatic disease22. Tumour cells coated with these carbohydrate chains are recognized as migrating leukocytes, allowing them to escape the bloodstream and metastasize to other organs and tissues, such as the lymph nodes and bone marrow23,24.

To functionally bind sialyl Lea/x in vivo, both P- and L-selectins require additional interactions with negatively charged sulphate groups, either on the carbohydrate chain itself or on an adjacent peptide sequence. E-selectin has no such requirement and can functionally bind sialyl Lea/x in glycolipids25 and glycoproteins26.

The involvement of negatively charged groups, such as sulphates and carboxylates, in the binding of L- and P-selectin has led to one of the major pitfalls in designing small-molecule inhibitors for the selectins. A wide range of structurally diverse, negatively charged molecules has been reported to bind P- and L-selectins. These include sulphatides27, heparins28, fucoidan29, sulphated dextran30, chondroitin sulphate31, dermatan sulphate32, tyrosine sulphates33, sulphated hyaluronic acid34 and sulphogalabiose35. Such a range of molecules suggests that their inhibitory activity is due to nonspecific negative-charge interactions. In fact, a cautionary publication36 described potent P-selectin activity found in trace contaminants of polyanions from ion exchange media used in the preparation of samples. Thus, the specificity of small-molecule, highly charged selectin antagonists that inhibit P- and L- but not E-selectin must be carefully evaluated.

In diseases in which cell adhesion, extravasation of cells from the bloodstream or the migration of specific lymphocytes has been implicated in the pathology, selectins present an attractive therapeutic target. For example, E- and P-selectins have been shown to mediate the acute adhesion and aggregation of leukocytes and erythrocytes during a vaso-occlusive crisis in a mouse model of sickle cell disease37,38. Furthermore, aberrant extravasation of cells from the bloodstream is the hallmark of many inflammatory diseases (such as asthma, colitis, arthritis and psoriasis) and cancer. Tumour cells that extravasate out of the bloodstream use the selectin pathway to metastasize. Many solid tumours and adenocarcinomas, such as gastrointestinal39, pancreatic40, breast41, lung42 and prostate43 cancers, express high levels of sLex and sLea. Expression of these selectin ligands on the tumour cells of patients with gastric and colon cancers44 is significantly correlated with poor survival22. Cimetidine (Tagamet; GlaxoSmithKline), a histamine receptor antagonist that also suppresses vascular expression of E-selectin, markedly and specifically improved survival of high-risk patients identified by tumour expression of sLea and sLex (Ref. 45), further supporting the usefulness of selectins as therapeutic targets for cancer.

Selectins and their ligands have also been reported to play key parts in the dissemination of haematological cancers46 and the homing of leukaemic stem cells to microdomains within the bone marrow47. E-selectin is constitutively expressed in the bone marrow48 and binds carbohydrate ligands that are found on leukaemic stem cells. Once adherent to these microdomains in the bone marrow, leukaemic cells become quiescent and less susceptible to killing by anti-proliferative chemotherapy drugs such as cytosine arabinoside49. Potent selectin antagonists present new therapeutic opportunities for treating these diseases. By preventing sequestration of leukaemic cells in the bone marrow and keeping them in circulation, combination therapy with selectin antagonists is likely to make the cells more susceptible to chemotherapy. Some examples of glycomimetic, small-molecule antagonists of the selectins are presented in Table 1.

Table 1 Small-molecule selectin antagonists in preclinical and clinical trials

DC-SIGN. Mucosal surfaces present barriers to the environment that are potentially susceptible to infection. Migrating dendritic cells guard mucosal surfaces, capturing microorganisms and presenting processed antigens to activated T cells, thereby inducing an immune response against the invading pathogens. By screening a library of dendritic cell-specific monoclonal antibodies that inhibit binding to intercellular adhesion molecule 3 (ICAM3; an adhesion molecule that activates T cells), a single cell surface protein was discovered: DC-SIGN50.

The amino-acid sequence of DC-SIGN is identical to a previously described HIV glycoprotein 120 (gp120)-binding C-type lectin51,52. DC-SIGN that is expressed on patrolling dendritic cells in the mucosa binds to carbohydrate structures on the gp120 protein coat of HIV, which is the initial entry port of HIV to the host. HIV particles bound to DC-SIGN on the surface of dendritic cells are protected from destruction in the blood and migrate to the lymph nodes where they trans-infect T cells through the CD4–CCR5 (CC-chemokine receptor 5) complex on the T cell surface51. The binding specificity of DC-SIGN is for fucose and mannose residues, with higher affinity and specificity for the fucose linkage in Lea/x-type oligosaccharide structures. Formation of the active structure and binding of DC-SIGN occurs in a Ca2+-dependent manner52,53.

In addition to HIV, various other pathogens exploit DC-SIGN to infect their host, including the hepatitis C virus54, Dengue virus55, Ebola virus56, Marburg virus57, coronavirus (which causes severe acute respiratory syndrome)58 and cytomegalovirus59, bacteria such as Mycobacterium tuberculosis60 and Helicobacter pylori52, and yeast (Candida albicans). More recently, even parasites such as Leishmania spp.61 and Schistosoma mansoni62 have been shown to bind DC-SIGN.

The fact that different pathogens have capitalized on this infection strategy makes DC-SIGN an interesting target for therapeutic intervention. In a study on the binding and transfer of HIV in human rectal mucosa cells, more than 90% of bound virus was bound to cells expressing DC-SIGN, although these cells represented only 1–5% of the total mucosal mononuclear cells. Furthermore, DC-SIGN-specific antibodies blocked more than 90% of HIV binding63. Other studies have shown that multivalent glycoconjugates of Lewisx or D-mannose prevented the attachment of Ebola or herpes virus to dendritic cells through DC-SIGN and thus prevented the subsequent infection of immune cells64,65,66.

Glycomimetic compounds that inhibit DC-SIGN are based on two lead structures. The first are high-mannose oligosaccharides and the second is L-fucose as part of a Lewis epitope67. These determinants are synthesized by pathogens to camouflage their appearance as host tissue. To improve the affinity and pharmacokinetic properties of these naturally occurring antagonists, glycomimetics of both types of ligands have been synthesized.

High-density arrays of unbranched Manα(1-2)Man-terminated oligosaccharides bind to DC-SIGN almost as effectively as the entire Man9 oligosaccharide (Ref. 68). Therefore, the non-reducing end Manα(1-2)Man fragment of Man9 was suggested to play a crucial part in DC-SIGN recognition. To mimic 1,2-mannobiose, one hexose moiety was replaced by a cyclohexanediol derivative, leading to the pseudo-1,2-mannobioside compound 1 (Fig. 2), which had a threefold greater affinity for DC-SIGN than did 1,2-mannobiose (half-maximal inhibitory concentration (IC50) = 0.62 mM and 1.91 mM, respectively)69. Furthermore, in infection studies using an in vitro model of Ebola infection, the glycomimetic compound 1 inhibited infection of DC-SIGN-expressing Jurkat cells more efficiently than the corresponding natural disaccharide. Although the inhibitory concentration in these experiments was in the millimolar range, compound 1 might be useful in the preparation of high-affinity multivalent antagonists. Such an approach is encouraged by the strong inhibitory effects of multivalent antagonists on DC-SIGN binding, as observed for dendritic mannose conjugates70 or oligolysine-based oligosaccharide clusters71.

Figure 2: Ligands of dendritic cell-specific ICAM3-grabbing non-integrin 1 (DC-SIGN).

DC-SIGN co-crystallized with the natural epitopes Galβ(1-4)[Fucα(1-3)]GlcNAcβ(1-3)Gal (Protein Data Bank code 1SL5) (a) and Manα(1-6)[Manα(1-3)]Manα(1-6)Man (PDB code 1SL4) (b). The protein backbone is depicted in ribbon style, carbohydrates are shown in ball and stick style and the grey sphere is Ca2+. Part c shows the structures of DC-SIGN antagonists. The glycomimetics compound 1 (Ref. 69) and compound 2 (Ref. 72) have only a slightly improved affinity compared with the natural ligands Manα(1-2)Man68 and Lewisx (Ref. 72), whereas the non-carbohydrate antagonists compound 3 and compound 4 have half-maximal inhibitory concentration (IC50) values in the low micromolar range73.

Similarly, α-fucosylamine linked to 2-aminocyclohexanecarboxylic acid (compound 2) mimics Lewisx trisaccharide and inhibits DC-SIGN with a twofold greater potency (IC50 = 0.35 mM and 0.8 mM, respectively)72. These binding affinities are too weak for these compounds to have any therapeutic promise; however, when the oligosaccharides are displayed on large multivalent dendrimers, activity is greatly improved and biological activity can be shown in vitro71. Although such large multivalent presentations of carbohydrates or mimics thereof are a relatively simple means to increase activity, they pose a pharmaceutical challenge in terms of routes of administration and possible side effects, such as unwanted immune responses.
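The fold improvements quoted for compounds 1 and 2 can be put on an energy scale with a short calculation. The following Python sketch is illustrative only: it uses the IC50 values given in the text, but the function names are our own, the temperature of 298.15 K is an assumption, and the conversion to an apparent free-energy difference is valid only if the IC50 ratio measured within one assay approximates the ratio of dissociation constants.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fold_improvement(ic50_reference_mM, ic50_mimic_mM):
    """Ratio of reference IC50 to mimic IC50; values > 1 mean the mimic is more potent."""
    return ic50_reference_mM / ic50_mimic_mM


def apparent_ddg_kj_per_mol(fold, temperature_K=298.15):
    """Apparent free-energy gain, -RT ln(fold).

    Assumption: the IC50 ratio tracks the Kd ratio (same assay, same conditions).
    """
    return -R_KJ_PER_MOL_K * temperature_K * math.log(fold)


# IC50 values (mM) quoted in the text: (natural ligand, glycomimetic)
pairs = {
    "compound 1 vs 1,2-mannobiose": (1.91, 0.62),
    "compound 2 vs Lewisx": (0.8, 0.35),
}

for label, (reference, mimic) in pairs.items():
    fold = fold_improvement(reference, mimic)
    ddg = apparent_ddg_kj_per_mol(fold)
    print(f"{label}: {fold:.1f}-fold, apparent ddG = {ddg:.1f} kJ/mol")
```

Under these assumptions, a two- to threefold gain in potency corresponds to only about 2 to 3 kJ/mol, which illustrates why monovalent mimics of this kind remain in the millimolar range.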

A classical approach to discovering DC-SIGN antagonists was successfully demonstrated by screening large libraries of small molecules in an automated assay format. By screening over 35,000 compounds, 7 hits with IC50 values in the low micromolar range were identified, such as compound 3 and compound 4 (Ref. 73). Interestingly, the structures of these hits bear no resemblance to the native carbohydrate ligands of oligomannose or the Lewis epitopes and do not contain functional groups to interact with Ca2+ in the CRD. Their inhibitory activity could be caused by binding to other domains on DC-SIGN, leading to an allosteric effect.

I-type lectins

I-type lectins are a family of carbohydrate-binding proteins in the immunoglobulin superfamily, and include Siglecs74. The Siglecs function as cell signalling co-receptors and are primarily expressed on leukocytes that mediate acquired and innate immune functions. The cytoplasmic domains of most Siglecs contain immunoreceptor tyrosine-based inhibitory motifs, which are characteristic of accessory proteins that regulate transmembrane signalling and endocytosis of cell surface receptor proteins. The diverse specificity for their sialoside ligands and variable cytoplasmic regulatory elements enable Siglecs to carry out unique roles at the cell surface. Siglecs can be broadly divided into an evolutionarily conserved group (Siglec 1 (also known as sialoadhesin), Siglec 2 (also known as CD22) and Siglec 4 (also known as MAG)) and a Siglec 3-related group (Siglec 3 and Siglecs 5–13). The evolutionarily conserved group shows selective binding properties: Siglec 1 and MAG preferentially bind α(2-3)-linked N-acetylneuraminic acid (Neu5Ac) and Siglec 2 is highly specific for α(2-6)-linked Neu5Ac. By contrast, members of the Siglec 3-related group are more promiscuous in their binding, often recognizing more than one presentation of Neu5Ac.

The most comprehensively characterized Siglecs are Siglec 2, a regulatory protein that prevents the overactivation of the immune system and the development of autoimmune diseases, and MAG, a protein that blocks regeneration of the central nervous system (CNS) after injury75.

MAG. Unlike the peripheral nervous system (PNS), the injured adult CNS inherently lacks the capacity for axon regeneration. Although neurite outgrowth is possible in principle, it is blocked by inhibitor proteins expressed on residual myelin and on astrocytes that are recruited to the site of injury. To date, three major inhibitor proteins have been identified: reticulon 4 (RTN4; also known as nogo A)76, myelin oligodendrocyte glycoprotein (MOG)77 and MAG78. These three proteins bind to and activate the RTN4 receptor, which is located on the surface of the neuron. This leads to the formation of a complex with the nerve growth factor receptor (NGFR; also known as p75NTR) and the activation of the RhoA–ROCK (Rho-associated, coiled coil-containing protein kinase) cascade, which results in growth cone collapse79.

The RhoA–ROCK inhibitory cascade can also be triggered by a complex formed by MAG, brain gangliosides (especially GM1b, GD1a, GT1b, GT1β and GQ1bα)80 and NGFR81. Although the exact biological role of the MAG–ganglioside interaction has yet to be resolved, in some systems inhibition of axon regeneration by MAG could be completely reversed by sialidase treatment, suggesting that sialylated glycans are the main axonal ligands of MAG82. SAR studies83,84,85 have revealed that the terminal tetrasaccharide epitope Neu5Acα(2→3)-Galβ(1→3)-[Neu5Acα(2→6)]-GalNAc of GQ1bα shows superior binding to MAG compared with the terminal trisaccharide epitope, which is present in GD1a and GT1b, for example86. Further refinements of the SAR profile have led to the identification of MAG antagonists that have improved affinities and, at least in some cases, remarkably simple structures (Fig. 3). However, owing to the use of different assay formats, it has been difficult to compare the reported affinities of these compounds for various ligands.

Figure 3: Myelin-associated glycoprotein (MAG) antagonists.

a | MAG, nogo 66 and myelin oligodendrocyte glycoprotein (MOG) bind to the reticulon 4 receptor (RTN4R; also known as the nogo receptor). The inhibitory signal is transduced into the cytosol of the neuron through the co-receptor NGFR (nerve growth factor receptor; also known as p75NTR). MAG bound to the brain gangliosides GD1a, GT1b and GQ1bα also transduces the inhibitory signal, with the help of NGFR as a co-receptor, into the cytosol79. b | GQ1bα is the brain ganglioside with the highest affinity for MAG80; replacement of its inner sialic acids by sulphates (to produce compound 5) led to a fourfold increase in affinity165. The tetrasaccharide compound 6 (Ref. 87) is the minimal carbohydrate epitope of GQ1bα for MAG binding and has served as a lead structure for the development of antagonists; with compound 7, an excellent correlation between the degree of neurite outgrowth and the binding affinities was established88. Further modifications involved the replacement of the Galβ(1-3)GalNAc core (to produce compound 8 (Ref. 166)) or the α(2-6)-linked Neu5Ac (to produce compound 9 (Ref. 135)). Following studies on compound 10 (Ref. 167), numerous Neu5Ac derivatives168,169, for example, compound 11, with up to nanomolar affinities have been synthesized. Affinity data of the different compounds should be compared with caution as they were obtained from different assays. IC50, half-maximal inhibitory concentration; Kd, dissociation constant; RIP, relative inhibitory potency.

Overall, starting from the low-affinity tetrasaccharide lead structure compound 6 (Ref. 87), low-molecular-mass MAG antagonists with nanomolar affinity and excellent stability in the spinal cord fluid have been identified (S. Mesch, D. Moser, A. Vedani, B. Cutting, M. Wittwer, H. Gäthje, S. Shelke, D. Strasser, O. Schwardt, S. Kelm and B. Ernst, unpublished observations). The high correlation between the degree of neurite outgrowth and the binding affinities of these antagonists further validates MAG as a therapeutic target and suggests that potent glycan inhibitors of MAG have the potential to enhance axon regeneration88.

Bacterial and viral lectins

For colonization and subsequent development of an infectious disease, enteric, oral and respiratory bacteria require adhesion to the host's tissue. This grants them a substantially greater resistance to clearance and killing by immune factors, bacteriolytic enzymes and antibiotics. In addition, such bacteria are better able to acquire nutrients, further enhancing their ability to survive and infect the host. Therefore, anti-adhesive drugs that prevent the adhesion of pathogens to host tissues may offer a novel strategy to fight infectious diseases89. The alarming increase in drug-resistant bacterial pathogens makes a search for new approaches to fight bacterial infections essential90. Because anti-adhesive agents are not bactericidal, they are less likely to promote the propagation of resistant strains than bactericidal agents, such as antibiotics.

The carbohydrate epitopes on the surface of host cells that are used by bacteria and viruses for colonization and infection (Table 2) are the starting point of the search for glycomimetic entry inhibitors.

Table 2 Carbohydrate epitopes used by bacteria and viruses for recognition and entry

A challenge of anti-adhesion therapy is that most pathogens possess genes encoding several types of adhesins, so that, during the infection process, they may express more than one of these adhesins. Glycomimetic antagonists that are designed to inhibit multiple adhesins are feasible to develop, and examples are described below for Pseudomonas aeruginosa.

P. aeruginosa virulence factors (PA-IL and PA-IIL). P. aeruginosa can be part of the normal flora in healthy adults but becomes a deadly pathogen in individuals who are immunocompromised, patients with cystic fibrosis and hospitalized, critically ill patients. An increasing percentage of P. aeruginosa infections are antibiotic resistant.

For its adhesion to host cells, the pathogen expresses lectins such as PA-IL and PA-IIL91. These lectins are virulence factors under quorum sensing control and are, by themselves, cytotoxic to primary epithelial cells in culture92. At low concentrations, they inhibit ciliary beating of epithelial cells in explants of nasal polyps93. Inhibition can be completely reversed by treatment with the carbohydrate ligand of the lectin. Thus, 24 hours after addition of fucose, ciliary beating returns to normal frequency94.

PA-IL and PA-IIL are tetrameric lectins that require Ca2+ for carbohydrate binding. The crystal structures of both lectins complexed with their carbohydrate ligands have been resolved (Fig. 4). PA-IL preferentially binds to terminal α-linked D-galactose in the presence of one Ca2+ ion, whereas PA-IIL binds with an unusually strong micromolar affinity to L-fucose and requires two Ca2+ ions95,96. PA-IL and PA-IIL are soluble intracellular lectins. However, once released from the cells, these lectins cause bacteria to adhere to host tissue — a process that can be reversed by incubation with D-galactose and D-mannose, respectively97.

Figure 4: PA-IL and PA-IIL inhibitors.

a | Binding sites of PA-I galactophilic lectin (PA-IL) complexed with D-galactose (Protein Data Bank code: 1OKO (Ref. 170)). b | Binding sites of fucose-binding lectin PA-IIL complexed with L-fucose (PDB code: 1GZT (Ref. 143)). In parts a and b, the protein backbones are depicted in ribbon style, carbohydrates are shown in ball and stick style and the grey spheres are Ca2+. c | The monovalent ligands compound 12 (Ref. 101) and compound 13 (Ref. 99) exhibit affinity for PA-IL and PA-IIL that is similar to that of Lewisa (Ref. 100); the most potent oligovalent ligand is compound 14 (Ref. 102), but it has only a modest effect on a per saccharide basis; the heterobifunctional glycodendrimer compound 15 (Ref. 104) and the low-molecular-mass glycomimetic compound 16 (Ref. 105) bind to both PA-IL and PA-IIL from Pseudomonas aeruginosa.

The native carbohydrate inhibitors of PA-IL and PA-IIL, D-galactose and L-fucose, were successfully used to treat a tobramycin-resistant P. aeruginosa infection in a case report98. Combination therapy of tobramycin with D-galactose and L-fucose to inhibit the virulence factors PA-IL and PA-IIL cured an 18-month-old infant with systemic and pulmonary infections, as determined by microbiological testing.

Screening with the glycan arrays of the Consortium for Functional Glycomics revealed that the Lewisa trisaccharide, Galβ(1-3)[Fucα(1-4)]GlcNAc, is a high-affinity ligand for PA-IIL99, with a dissociation constant of 210 nM (Ref. 100). To reduce the complexity of the trisaccharide antagonists, glycomimetics based on the Fucα(1-4)GlcNAc disaccharide (for example, the antagonist compound 12) were synthesized. By titration calorimetry experiments, increased entropy costs upon binding were detected as a result of the higher flexibility of Fucα(1-4)GlcNAc compared with Lewisa. However, additional enthalpic interactions that originate from a network of hydrogen bonds compensate for this entropic penalty101. A further simplification of the PA-IIL antagonists was achieved when α-L-fucosides bearing heterocyclic substituents as aglycons were synthesized. Surprisingly, some candidates (for example, compound 13) have a similar potency to Lewisa (Ref. 99).
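The relationship between the dissociation constant and the enthalpic and entropic contributions measured by calorimetry can be sketched as follows. This is a minimal illustration, not the analysis used in the cited studies: the temperature (298.15 K) and the 1 M standard state are assumptions, the function names are our own, and the enthalpy values in the last lines are hypothetical numbers chosen only to show how enthalpy–entropy compensation leaves the free energy unchanged.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def binding_free_energy_kj_per_mol(kd_molar, temperature_K=298.15):
    """Standard binding free energy, dG = RT ln(Kd / 1 M)."""
    return R_KJ_PER_MOL_K * temperature_K * math.log(kd_molar)


def entropy_term_kj_per_mol(delta_g, delta_h):
    """Return -T*dS from dG = dH - T*dS (all values in kJ/mol)."""
    return delta_g - delta_h


# Kd of Lewisa for PA-IIL as quoted in the text (210 nM)
dg_lewis_a = binding_free_energy_kj_per_mol(210e-9)
print(f"dG (Lewisa, PA-IIL) = {dg_lewis_a:.1f} kJ/mol")

# Hypothetical enthalpies (kJ/mol), NOT measured values: a more flexible ligand
# with a more favourable dH pays a larger entropic penalty for the same dG.
for label, delta_h in [("rigid ligand", -30.0), ("flexible ligand", -45.0)]:
    minus_t_ds = entropy_term_kj_per_mol(dg_lewis_a, delta_h)
    print(f"{label}: dH = {delta_h:.1f}, -T*dS = {minus_t_ds:.1f} kJ/mol")
```

With the assumed temperature, a dissociation constant of 210 nM corresponds to a binding free energy of roughly −38 kJ/mol; a positive −TΔS term in the output marks an entropic penalty that has to be offset by additional enthalpic interactions.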

Oligovalent forms of the Fucα(1-4)GlcNAc epitope, such as compound 14 (Ref. 102), exhibit increased activity compared with monovalent forms; however, in most cases, this effect was only modest on a per saccharide basis. To date, multivalency has only been explored with dendrimers that present L-fucose, which show an increase in affinity of up to a factor of 20 on a per saccharide basis103. Finally, to prevent adhesion of P. aeruginosa mediated simultaneously by the PA-IL and PA-IIL lectins, heterobifunctional ligands that present both D-galactose and L-fucose in an oligovalent array (as in compound 15 (Ref. 104)) or as a small-molecule glycomimetic (as in compound 16 (Ref. 105)) have been constructed. In a study to determine the efficacy of compound 16 in mice surgically stressed by 30% hepatectomy, 60% of the control group died 48 hours after acute infection with P. aeruginosa, whereas 100% of mice treated with compound 16 survived105.
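Whether a multivalent ligand offers more than the sum of its epitopes is usually judged on a per saccharide basis. The sketch below shows one common way to make this correction, assuming that potency is expressed as an IC50 measured in the same assay for the monovalent and the multivalent ligand; the function name and the numerical values are hypothetical and are not taken from the cited studies.

```python
def per_saccharide_enhancement(ic50_monovalent, ic50_multivalent, valency):
    """Overall potency gain divided by the number of displayed saccharides.

    A value of 1 means the multivalent ligand is no better than the same
    number of independent monovalent ligands; values > 1 indicate a
    genuine multivalency effect.
    """
    if valency < 1:
        raise ValueError("valency must be at least 1")
    if ic50_monovalent <= 0 or ic50_multivalent <= 0:
        raise ValueError("IC50 values must be positive")
    overall_gain = ic50_monovalent / ic50_multivalent
    return overall_gain / valency


# Hypothetical example: an octavalent fucose dendrimer (IC50 values in µM)
enhancement = per_saccharide_enhancement(
    ic50_monovalent=100.0, ic50_multivalent=0.625, valency=8
)
print(f"per saccharide enhancement: {enhancement:.0f}-fold")
```

In this invented example, a 160-fold overall gain for eight copies of the epitope reduces to a 20-fold gain per saccharide, the same order as the upper limit reported for the fucose dendrimers.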

FimH. Urinary tract infections (UTIs) are among the most prevalent inflammatory diseases that are caused by pathogens106,107. The predominant pathogen in UTIs is uropathogenic Escherichia coli (UPEC), which causes more than 80% of all infections in otherwise healthy people (uncomplicated UTI). In healthy individuals, most uropathogens originate from the rectal microbiota and enter the normally sterile urinary bladder through the urethra, where they trigger the infection (cystitis). Once in the urinary tract, bacteria attach to the urinary tract epithelium through fimbrial adhesion molecules to avoid the host's defence mechanisms. Once bound, the bacteria are presumably internalized in an active process that is similar to phagocytosis108.

Uncomplicated UTI can be effectively treated with oral antibiotics such as fluoroquinolones, cotrimoxazole or amoxicillin and clavulanate, depending on the susceptibility of the pathogen involved. However, recurrent infections and subsequent antibiotic exposure can result in the emergence of antimicrobial resistance, which often leads to treatment failure and reduces the range of therapeutic options. So, there is an urgent need for efficient, cost-effective and safe non-antibiotic therapy to prevent and treat UTIs without facilitating antimicrobial resistance. Inhibition of type 1 fimbriae-mediated bacterial attachment to the bladder epithelium is a promising approach to achieve this goal109. Studies showed that α-mannosides are the primary bladder cell ligands for UPEC and that the attachment event requires the highly conserved FimH lectins, which are located at the tip of the bacterial fimbriae. A structure–function analysis showed that the residues of the FimH mannose binding pocket are invariant across 200 UPEC strains110.

More than two decades ago, various oligomannosides111 and aromatic α-mannosides112 that antagonize type 1 fimbriae-mediated bacterial adhesion were identified. Two approaches have been taken to improve their affinity: the rational design of ligands guided by information obtained from the crystal structure of FimH, and the multivalent presentation of the α-mannoside epitope.

The crystal structure of the FimH receptor-binding domain was solved in 1999 (Ref. 113) and the corresponding complex with oligomannoside-3 (Ref. 114) has recently become available. Despite this detailed knowledge of the binding event, few attempts to translate this information into low-molecular-mass antagonists have been reported112,115,116,117. A selection of monovalent FimH antagonists is depicted in Fig. 5. The reference compound, methyl α-D-mannoside (compound 17), binds in the millimolar range118, whereas the most potent monovalent antagonist reported so far, compound 22, binds with nanomolar affinity117.

Figure 5: FimH antagonists.

The crystal structure representation shows the mannose derivative compound 21 docked to the mannose-binding pocket of FimH (Protein Data Bank code: 1KFL). The relative inhibitory potencies (RIPs) of the FimH antagonists compounds 18 to 28 are based on methyl α-D-mannoside (compound 17; RIP = 1). As the RIPs were obtained from different assays (yeast agglutination, adherence to cell lines derived from human urinary bladder epithelium or guinea pig epithelial cells, as well as surface plasmon resonance experiments with immobilized FimH), they should be compared with caution.

The reported affinities can be explained on the basis of the structure of the CRD that is located on the tip of the FimH protein (Fig. 5). First, the hydroxyl groups at the 2, 3, 4 and 6 positions of mannose form an extended hydrogen bond network114,118. Second, the entrance to the binding site formed by two tyrosines and one isoleucine — the so-called 'tyrosine gate' — supports hydrophobic contacts118. The aromatic aglycons of antagonists — as occur in compounds 20 and 21, for example — can establish energetically favourable π–π interactions with this tyrosine gate, leading to substantially improved affinities. A further enhancement of affinity was achieved using oligovalent and multivalent FimH antagonists (for example, compounds 23 to 28).

Soluble FimH antagonists that are applied to prevent bacterial adhesion to the host tissue are faced with the challenge of mechanical forces resulting from fluid flow. It is commonly presumed that the duration of receptor–ligand interactions is shortened by shear stress. However, it was recently discovered that the ability of E. coli to avoid detachment is dramatically increased by shear stress119. As a consequence of shear stress-enhanced adhesion, E. coli evades detachment from body surfaces by soluble glycoproteins or peptides that are ubiquitous in body fluids. An example is the glycoprotein uromodulin (also known as the Tamm–Horsfall urinary glycoprotein), which binds to FimH and is thought to function as a body defence against E. coli infections120. On the basis of simulations121, it is thought that force-induced separation of FimH from its mannose ligand causes a conformational change of the binding pocket from a low-affinity to a high-affinity conformation. Instead of the application of competitive antagonists, allosteric antagonists that are capable of stabilizing the low-affinity conformation might lead to a successful therapy.

Although monovalent and oligovalent antagonists with nanomolar affinity have been reported, there are no data available regarding their pharmacokinetic properties. However, for the treatment of UTI, oral bioavailability and fast renal excretion to reach the targets in the urinary tract are prerequisites for therapeutic success.

Rational design: challenges and lessons learned

As in other fields that have spawned successful new therapeutics (for example, monoclonal antibodies), years of effort have been required to understand the unique challenges that are inherently linked to carbohydrate-derived drugs and to develop the basic skills and the specific knowledge to move from the excitement of scientific discovery to the development of a new class of therapeutics.

Although animal lectins usually show a high degree of specificity for glycan structures, their single-site binding affinities are typically low. In biological systems, functional affinity is often attained by the oligovalent presentation of CRDs, either in an oligomeric protein (for example, cholera toxin122) or through clustering at cell surfaces (for example, asialoglycoprotein receptor123). Additionally, the pharmacokinetic properties of carbohydrate hits, such as bioavailability or plasma half-life, are typically unsatisfactory for therapeutic applications. Finally, although tremendously improved novel glycosylation protocols124 and solid-phase approaches125 have become available, oligosaccharides are still only manufactured by cumbersome multi-step syntheses.

Therefore, the challenge is to mimic the structural information of a functional carbohydrate with a compound that has drug-like characteristics. The first step in this process is to understand the SAR of a carbohydrate lead, specifically the contribution made by each functional group to binding as well as the three-dimensional presentation of the pharmacophores. Based on this information, it is possible to identify glycomimetics that are pre-organized in their bioactive conformation — that is, which will adopt their bound conformation in solution. In addition, the mimics should show improved pharmacokinetic properties — in particular, improved bioavailability and serum half-life — while minimizing toxicity and cost of synthesis. In the past, the development of carbohydrate-derived drugs was often not entirely focused on simultaneously solving all of the above requirements and some high-profile failures resulted, notably in the field of selectin antagonists. Nevertheless, rationally designed glycomimetics have the potential to reap the rewards of a relatively untapped source of novel therapeutics for wide-ranging and important biological and medical applications.

Understanding native interactions. The starting point for the rational design of glycomimetics is the analysis of the binding characteristics of the carbohydrate–CRD binary complex. The three-dimensional structure of the lectin or the carbohydrate–lectin complex has been solved for a number of therapeutically interesting targets. Thus, E-, P- and L-selectin co-crystallized with sLex or PSGL1 (P-selectin glycoprotein ligand 1)126, sialoadhesin co-crystallized with 3′-sialyl lactose127, or DC-SIGN co-crystallized with the pentasaccharide GlcNAc2Man3 (Ref. 128) hold valuable information for the rational design of glycomimetics. In cases in which the structure has not yet been solved, homology models can be generated — as is the case for MAG, for example129.

Detailed insight into the binding event can be gained by nuclear magnetic resonance (NMR) experiments. For example, the bound conformation of a functional carbohydrate ligand in the CRD of the target lectin can be determined using transferred nuclear Overhauser effect (NOE)130. In addition, the binding epitope can be identified by saturation transfer difference NMR spectroscopy (STD NMR spectroscopy)131. This technique has been used to study interactions of carbohydrate ligands with the rotavirus receptor, VP8 (Ref. 132), the anti-carbohydrate tumour-associated antibody GSLA1 (Ref. 133), E-selectin134 and MAG87,135. Overall, transfer NOE NMR and STD NMR experiments allow a rapid insight into the binding characteristics of carbohydrate–lectin interactions and can replace, at least partially, X-ray investigations and the time-consuming mapping of binding epitopes by chemical means136.

Enhancing binding affinity. The generally low affinity of carbohydrate–lectin interactions is a consequence of the shallow binding sites of lectins, which leave the complex-forming hydrogen bonds and salt bridges highly accessible to solvent. Owing to large off-rates (koff), the binary complexes are characterized by short dissociative half-lives (t1/2), typically in the range of seconds, as shown for selectins and their physiological ligands137,138,139, the carbohydrate-recognizing antibody GSLA1, sLea (Ref. 133) and MAG antagonists135. Given that, for a therapeutic application, the t1/2 of a drug–target binary complex is expected to be in the range of minutes to a few hours, lowering the koff of glycomimetic compounds is mandatory140.
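The link between off-rate and dissociative half-life follows from first-order decay of the binary complex, t1/2 = ln 2 / koff. The short Python sketch below illustrates this relation. The function names and the numerical values in the usage note are illustrative assumptions, not data from the cited studies.

```python
import math


def dissociative_half_life(k_off: float) -> float:
    """Half-life (s) of a binary complex that dissociates with first-order rate k_off (1/s)."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return math.log(2) / k_off


def required_k_off(half_life_s: float) -> float:
    """Off-rate (1/s) that a ligand needs in order to reach a target half-life (s)."""
    if half_life_s <= 0:
        raise ValueError("half_life_s must be positive")
    return math.log(2) / half_life_s
```

Under these assumptions, a hypothetical complex with koff = 0.5 s⁻¹ has a half-life of about 1.4 s, whereas a half-life of 30 minutes would require a koff of roughly 4 × 10⁻⁴ s⁻¹, a reduction of more than three orders of magnitude.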

Often, mammalian lectins undergo numerous directed, but weak, interactions with their ligands. A specific example, the interaction of sLex with E-selectin, is outlined in Fig. 6a. It consists of six solvent-exposed hydrogen bridges and a salt bridge (to produce complex 29). One possible approach to improve affinity is to pre-organize the antagonist in its bioactive conformation to compensate for the low enthalpic contributions by reducing the entropy costs on binding. For E-selectin, this strategy was successful (see complex 30 in Fig. 6a). As elucidated by X-ray126 or STD NMR134 studies, the GlcNAc moiety does not interact with the binding site and serves solely as a linker that positions the galactose and the fucose moiety in the correct spatial orientation. It was successfully replaced by non-carbohydrate linkers141,142. In addition, steric repulsion deriving from properly placed substituents on the linker moiety can further improve the pre-organization of the core and, as a result, the affinity of the corresponding antagonist130. Furthermore, the pre-organization of the carboxylate was optimized as well, revealing (S)-cyclohexyl lactic acid as the best mimic of Neu5Ac141.
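The thermodynamic argument behind pre-organization can be made concrete with the standard relations ΔG = ΔH − TΔS and ΔG = RT ln Kd. The following Python sketch assumes a simple 1:1 binding equilibrium at 298.15 K. The function names are hypothetical and no values are taken from the E-selectin studies.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def binding_free_energy(delta_h: float, delta_s: float, temperature: float = 298.15) -> float:
    """Free energy of binding (kJ/mol) from enthalpy (kJ/mol) and entropy (kJ/(mol*K))."""
    return delta_h - temperature * delta_s


def kd_from_free_energy(delta_g: float, temperature: float = 298.15) -> float:
    """Dissociation constant (relative to 1 M standard state) from the free energy of binding."""
    return math.exp(delta_g / (R_KJ * temperature))


def fold_affinity_gain(entropy_cost_saved: float, temperature: float = 298.15) -> float:
    """Fold decrease in Kd when the entropic penalty T*dS is reduced by entropy_cost_saved (kJ/mol)."""
    return math.exp(entropy_cost_saved / (R_KJ * temperature))
```

In this simplified picture, each 5.7 kJ/mol of entropic penalty that a pre-organized mimetic avoids corresponds to a roughly tenfold lower Kd at room temperature, provided the enthalpic contributions are unchanged.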

Figure 6: Enhancing the affinity of carbohydrate-derived drugs.

a | The affinity of carbohydrate-derived drugs can be improved by pre-organization in the bioactive conformation. In solution, the core conformation (shown in red) of sialyl Lewisx is in the range of +10° to −60° and the acid orientation (shown in blue) is in the range of +80° to +150°. In the bioactive conformation (complex 29), the core conformation is approximately −40° and the acid orientation is approximately +110° (Refs 175–178). The degree of pre-organization of a mimetic in the bioactive conformation, as shown in complex 30, can be correlated with its affinity130,141. b | Affinity can be improved by establishing new enthalpic interactions; comparisons of the binding mode of Neu5Ac2en (compound 31), zanamivir (Relenza)7 and oseltamivir (Tamiflu)9 to neuraminidase are depicted. bb, backbone; sc, side chains.

If the target lectin offers a well-structured binding pocket, the free energy of binding can be improved by incorporating additional enthalpic contributions. Successful examples are the neuraminidase inhibitors zanamivir7 and oseltamivir9. For the influenza viral coat protein neuraminidase, the natural substrate Neu5Ac and the corresponding glycal Neu5Ac2en (compound 31), which mimics the transition state of the hydrolytic reaction, have only millimolar to micromolar affinities. The improved affinities of the transition state analogues zanamivir and oseltamivir result from a guanidinium substitution in the 4 position, enabling the formation of a new salt bridge7, or from the replacement of the glycerol side chain in the 6 position, leading to a new, favourable lipophilic interaction by induced fit9 (Fig. 6b).

Finally, multivalency frequently occurs in nature and leads to tight binding in situations in which univalent protein–ligand binding is weak143,144,145. Recognition of carbohydrate ligands by bacterial and mammalian lectins is an example of this phenomenon. For the specific inhibition of these recognition events, oligovalent ligands have been proposed (see, for example, Figs 4,5). However, the design of tight-binding oligovalent ligands is, for the most part, an empirical endeavour. Tailored oligovalency, whereby the spacing of a limited number of tethered branches is matched to that between adjacent sugar binding sites of a protein or a protein cluster, potentially offers substantial increases in avidity for the target143,146,147.

Pharmacokinetics. Unfortunately, only limited pharmacokinetic data are reported for any carbohydrate or glycomimetic. For oral absorption by passive permeation through the membrane barrier of the small intestine148, there are limitations regarding molecular mass, polarity and the number of hydrogen bridge donors and acceptors149. The hydrophilic nature of oligosaccharides caused by the large number of hydroxyl groups and charges (sulphates and carboxylates) makes their oral availability virtually impossible. Therefore, when glycomimetics are designed, the pharmacokinetic as well as the pharmacodynamic profile should be adjusted. Possible strategies to improve passive absorption are the bioisosteric replacement of crucial groups150 or a prodrug approach151. A successful example of the prodrug approach is oseltamivir, which is an ester prodrug. Once absorbed, the ester is metabolized to the corresponding carboxylate, the active metabolite RO64-0802 (Ref. 152). Its absolute bioavailability from the orally administered prodrug is 80%. It is detectable in plasma within 30 minutes and reaches maximal concentrations after 3–4 hours153.
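Limits of this kind on molecular mass, lipophilicity and hydrogen-bond donors and acceptors are commonly summarized as Lipinski's 'rule of five'. The Python sketch below assumes the widely quoted thresholds of that rule (molecular mass of at most 500 Da, log P of at most 5, at most 5 donors and at most 10 acceptors). The class name, the function name and both descriptor sets are hypothetical illustrations, not measured data for any compound discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    molecular_mass: float      # Da
    log_p: float               # octanol-water partition coefficient (log scale)
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> list[str]:
    """Return the rule-of-five criteria that the descriptor set violates."""
    checks = [
        ("molecular mass > 500 Da", d.molecular_mass > 500),
        ("log P > 5", d.log_p > 5),
        ("more than 5 hydrogen-bond donors", d.h_bond_donors > 5),
        ("more than 10 hydrogen-bond acceptors", d.h_bond_acceptors > 10),
    ]
    return [label for label, violated in checks if violated]


# Hypothetical descriptor sets, chosen only to illustrate the check.
polar_oligosaccharide = Descriptors(820.0, -5.0, 13, 23)
small_mimetic = Descriptors(310.0, 1.0, 3, 6)
```

With these illustrative numbers, the oligosaccharide-like descriptor set fails on mass, donors and acceptors, whereas the smaller mimetic passes all four criteria. Note that the rule flags only excessive lipophilicity, so the high polarity of sugars shows up through the donor and acceptor counts rather than through log P.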

In addition, the feasibility of using an active-transport system that is abundant in the intestine, liver, kidney or brain should also be considered154. Many drugs that are rationally designed or derived from natural products that cannot be absorbed by passive transport (such as β-lactam antibiotics, heart glycosides or fungicides) take advantage of active transport. In addition, active transport can be enforced by rational design — for example, by incorporating an amino acid into the structure and thereby creating a substrate for active transport by peptide transporter 1 (PEPT1; also known as SLC15A1) and PEPT2 (also known as SLC15A2). A successful example is valacyclovir (Valtrex/Zelitrex; GlaxoSmithKline), an antiviral drug used in the management of herpes simplex, in which valine was attached to the parent drug acyclovir (Zovirax; GlaxoSmithKline/Biovail), leading to a fivefold increase of the oral availability155. Extensive analysis of the structural requirements of the PEPT1 transporter identified numerous analogues with higher affinity than valine; this information will be valuable for improving the oral availability of glycomimetics156.

The usually short serum half-life and rapid excretion of carbohydrates present an additional challenge for the design of glycomimetic drugs. Measurements of degradation in the presence of serum or liver microsomes are routine assays of metabolic stability that must be incorporated early in the design process of glycomimetics157.

Organic anion and cation transport systems located in the liver and kidney are responsible for active excretion from the circulation158. The organic anion transporter family (OAT1 to OAT5) recognizes anions (specifically, carboxyl groups) connected to hydrophobic ring structures. RO64-0802, the active metabolite that is formed from oseltamivir, is an example of a glycomimetic drug with a serum half-life that is diminished by recognition and removal by the OAT system159. When probenecid, a competitive inhibitor of OAT1, is administered in combination with oseltamivir, the serum half-life of the active metabolite is extended160. This strategy has been suggested to extend the supply of the US government's stockpile of oseltamivir in case of a national emergency in response to a pandemic outbreak of influenza. Both interactions with probenecid and specific transporter assays should be examined early in the development of a glycomimetic containing charged groups to identify structural elements that may adversely affect serum half-life.
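The effect of blocking transporter-mediated excretion on serum half-life can be illustrated with a one-compartment model, in which t1/2 = ln 2 × Vd / CL. The Python sketch below makes the simplifying assumptions that total clearance is the sum of a transporter-mediated component and all other routes, and that an inhibitor removes a fixed fraction of the transporter-mediated component. All names and parameter values are hypothetical and do not describe oseltamivir or probenecid quantitatively.

```python
import math


def elimination_half_life(volume_of_distribution_l: float, clearance_l_per_h: float) -> float:
    """Elimination half-life (h) in a one-compartment model."""
    if volume_of_distribution_l <= 0 or clearance_l_per_h <= 0:
        raise ValueError("volume and clearance must be positive")
    return math.log(2) * volume_of_distribution_l / clearance_l_per_h


def half_life_with_transport_block(
    volume_of_distribution_l: float,
    transporter_clearance_l_per_h: float,
    other_clearance_l_per_h: float,
    blocked_fraction: float,
) -> float:
    """Half-life (h) when a fraction of transporter-mediated clearance is inhibited."""
    if not 0.0 <= blocked_fraction <= 1.0:
        raise ValueError("blocked_fraction must lie between 0 and 1")
    remaining = other_clearance_l_per_h + transporter_clearance_l_per_h * (1.0 - blocked_fraction)
    return elimination_half_life(volume_of_distribution_l, remaining)
```

In this model, the larger the share of clearance that depends on the transporter, the more a co-administered inhibitor prolongs the half-life, which is why transporter assays are informative early in development.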

Conclusions

Recent efforts to elucidate the complexity and functions of the human glycome by pooling resources and technologies among academic centres have led to a rapid influx of discoveries and the acknowledgement of a new source of structural information that is not apparent from the human genome. The efforts in drug discovery reviewed here show the challenges in medicinal chemistry that need to be met for the development of drug-like glycomimetics.

Past efforts in this field have highlighted the drawbacks of using native oligosaccharides as drugs. Typically, both their pharmacodynamic and pharmacokinetic properties are insufficient for a therapeutic application. In addition to the lack of affinity, they suffer from low tissue permeability, short serum half-life and poor stability. Glycomimetics are designed to correct these shortcomings. The detailed insight into carbohydrate–lectin interactions that is required is predominantly provided by recent progress in NMR spectroscopy and X-ray crystallography. Thus, the identification of the bound conformation of a functional carbohydrate by transferred NOE NMR allows the design of mimetics with pharmacophores that are pre-organized in their bioactive conformation, leading to reduced entropy costs upon binding. By incorporating additional binding sites, which frequently leads to hydrophobic contacts, a further enhancement of affinity can often be achieved. Finally, the knowledge of the binding epitope as obtained by STD NMR allows the identification of negligible and replaceable functional groups. As a consequence, the design of glycomimetics that have improved absorption, distribution, metabolism and excretion can be accomplished.

Currently, these principles for the rational design of glycomimetics are being implemented in both academic institutions and industrial laboratories. As successful examples of glycomimetic drugs emerge, the strategies developed for their design will pave the way to realize the potential of this relatively untapped source of therapeutics.