Introduction

Nucleoside analogues are arguably the most important class of small molecules for the advancement of modern medicine, with widespread use as life-saving treatments for cancer and viral infections1,2,3,4,5. Within this domain, C4ʹ-modified nucleoside analogues constitute a popular subclass and are currently of immense interest for the development of antiviral small molecules and oligonucleotide-based therapeutics6,7,8,9,10,11,12,13,14,15,16,17. For instance, Chang recently disclosed a novel clinical candidate, CL-197 (1), with nanomolar activity (EC50 = 0.9 nM) and low cytotoxicity for treating HIV-1 infection7. In another high-profile example, Merck Inc. has advanced Islatravir (3), a nucleoside reverse transcriptase translocation inhibitor (NRTTI), into Phase III clinical trials for treating HIV9,10,11,12,13. Here, the ethynyl modification at C4ʹ is essential for increased binding interactions in the hydrophobic pocket of the HIV-1 reverse transcriptase (RT) and for modulating sugar ring conformation to hinder DNA synthesis6. The continued interest in C4ʹ-modified nucleoside analogues for clinical development is further evident in the ever-growing number of patents and high-impact research articles appearing in the recent literature. Despite their integral importance to human health over the past several decades, C4ʹ-modified nucleoside analogues remain highly difficult to synthesize. Traditional semi-synthetic approaches are hampered by iterative protection/deprotection sequences that ultimately lead to lengthy 9–16 step sequences with low modularity and poor atom economy17. Such processes are poorly suited to diverse library generation in drug discovery, as the four contiguous stereocenters that constitute the ribose core of the nucleoside pose a unique synthetic challenge for the efficient manipulation of the ribose backbone in any semi-synthetic strategy (see Fig. 1)8,9,10,11,12,13,14,15,16,17. In fact, a single synthetic route is rarely able to incorporate both different C4ʹ-modifications and different nucleobases. In recent years, Merck Inc. has invested considerable resources in addressing some of these shortcomings with the aim of advancing improved process routes for Islatravir (3)9,10,11,12,13. To date, Merck has reported five separate syntheses, each relying on one or more biocatalytic steps. Their efforts culminated in the development of a six-step biocatalytic approach involving the bioengineering of five novel enzymes for the synthesis of Islatravir (see 2 → 3)12. More recently, Kaspar reported a remarkable enzymatic trans-glycosylation of 4ʹ-methyluridine (4), which itself is prepared in ten steps from uridine, to access a collection of 4ʹ-methyl nucleoside analogues (5)14. While innovative biocatalytic approaches have become powerful tools for the synthesis of certain analogues, protocols strictly using de novo chemical synthesis are attractive alternatives because of their potential for increased modularity and because they are more accessible to practitioners overall15,16,17,18,19,20,21,22. In 2020, a collaborative effort between the Britton Lab and Merck resulted in a significantly shorter 3–4 step de novo synthesis of protected C4ʹ-modified nucleoside analogues15. Though this advance represented an impressive improvement over semi-synthetic approaches, it generates C4ʹ-modified nucleoside analogues in poor-to-good enantioselectivities (66–90% ee), is only applicable to a small subset of nucleobases, and mostly affords the d-lyxose configuration, which is less desirable in drug discovery than the canonical d-ribose configuration. Currently, there is no general platform that provides direct access to libraries of naturally configured C4ʹ-modified nucleoside analogues.

Fig. 1: Nucleoside analogue synthesis.

A Traditional and biocatalytic approaches for making C4ʹ-modified nucleoside analogues. B Our five-step de novo synthesis of C4ʹ-modified nucleoside analogues.

Inspired by the work of Britton and MacMillan on the use of asymmetric aldol reactions for nucleoside synthesis15,16,18, we envisioned our own strategy centred on key scaffold (6), in which the ketone and the dimethyl acetal provide two points for structural diversification while also enabling ribose ring formation. In contrast to Britton’s process15,16, which introduces the nucleobase in the first step of the sequence, we aimed to install the nucleobase at a late stage, after construction of the modified ribose ring. MacMillan et al. had previously utilized a similar concept in their synthesis of C2ʹ-modified nucleoside analogues, where they first constructed the modified ribose core via a three-step sequence that employed a Mukaiyama aldol between an α-oxyaldehyde and a silyl ketene acetal, followed by glycosylation18. Here, we report a modular five-step process that relies on a unique sequence of intramolecular trans-acetalizations to construct the modified ribose core. Subsequent Vorbrüggen glycosylation provides rapid entry to several C4ʹ-modified nucleoside analogues. This platform addresses many of the aforementioned shortcomings associated with the synthesis of C4ʹ-modified nucleoside analogues (7) and provides a powerful tool for exploring chemical space around this valuable chemotype in support of drug design.

Results

As shown in Fig. 2, an enantioselective aldol reaction between 2,2-dimethoxyacetaldehyde (8; $0.90/g) and 2,2-dimethyl-1,3-dioxan-5-one (9; $2.9/g) afforded the central chiral building block 6 (93–94 %ee)23,24. This step, which utilizes inexpensive and readily accessible starting materials, has been performed on up to 20 g without loss of yield or enantioselectivity allowing us to conveniently stockpile this intermediate. We proceeded to explore 1,2-additions of the ketone functionality in 6 for the introduction of the eventual C4ʹ-modification. With methylmagnesium bromide, this step proceeded smoothly to afford the syn-diol 10 in 69% yield and good diastereoselectivity (5:1), setting the stage for the intramolecular trans-acetalization reaction to construct the modified ribose core 13. Critically, this one-pot sequence relies on the selective deprotection of the dimethyl acetal over the acetonide to unveil the oxocarbenium (11) required for ribose ring formation. Initial attempts here with traditional conditions such as catalytic amounts of Brønsted acids (e.g., AcOH, HCl, TsOH, TFA) were unselective and led to either decomposition of the starting material 10 or a complex mixture of unidentifiable by-products. Other reported conditions such as InCl3/H2O25, I2/acetone26, and TBSOTf/collidine27,28 (entry 5) were also unsuccessful. Following an extensive investigation, we found that by carefully altering the temperature and stoichiometry of TMSOTf and 2,6-lutidine we could effect the desired transformation (entries 3 and 4). To this end, TMSOTf (2 equiv.) and lutidine (1 equiv.) at −10 °C followed by a water quench cleanly afforded the desired product 13 in 66% yield via a remarkable sequence of intramolecular trans-acetalizations. The use of excess TMSOTf relative to lutidine proved critical for executing this reaction. 
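The selectivity figures quoted above can be read more concretely as isomer ratios. The following is a minimal Python sketch, not part of the reported work: the function names are illustrative, and the only inputs taken from the text are the 93–94% ee of 6 and the 5:1 diastereomeric ratio of 10.

```python
def ee_to_er(ee_percent):
    """Convert an enantiomeric excess (%) into (major %, minor %)."""
    if not 0 <= ee_percent <= 100:
        raise ValueError("ee must lie between 0 and 100 percent")
    major = (100 + ee_percent) / 2
    return major, 100 - major


def dr_to_major_fraction(major_parts, minor_parts):
    """Fraction of the major diastereomer for a ratio such as 5:1."""
    return major_parts / (major_parts + minor_parts)


if __name__ == "__main__":
    # Values taken from the text: 93-94% ee for 6 and 5:1 d.r. for 10.
    for ee in (93, 94):
        major, minor = ee_to_er(ee)
        print(f"{ee}% ee corresponds to an e.r. of {major}:{minor}")
    print(f"5:1 d.r. is {dr_to_major_fraction(5, 1):.1%} major diastereomer")
```

On this reading, an ee in the low-to-mid 90s means that only a few percent of the minor enantiomer is carried into the subsequent steps.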
We posited that the conversion of 10 to 13 proceeds by acetal deprotection and acetonide migration to afford intermediate 11, which then cyclizes to ribopyranoside 12a via nucleophilic attack by the primary alcohol (5-OH) on the oxocarbenium. Under the Lewis acidic conditions of the reaction mixture, 12a is converted to oxocarbenium 12b, which then undergoes a transannular cyclization to form 13. The plausibility of this transannular cyclization in hexose rings is supported by both literature reports29,30 and our own computational work. Using the ωB97X-D/Def2-SVP level of theory31, we computed a low energy barrier of +20 kJ/mol for the cyclization of 12b to 13, indicating that this is a highly feasible transformation at room temperature (see Supplementary Information page 86). Additionally, we carried out a few key experiments to support our mechanistic proposal. We hypothesized that acetonide migration must occur prior to cyclization in order for the reaction to proceed. To verify this, we blocked the 2-OH in compound 10 as an OMe (see Supplementary Information page 84) to purposely prevent acetonide migration. Attempts to cyclize this material were unsuccessful, as no cyclized products were observed, which strongly suggests that acetonide migration must occur prior to cyclization so that the primary 5-OH can be unveiled. An alternative cyclization (see Fig. 2, pathway B) occurring via nucleophilic attack by the tertiary alcohol (4-OH) in intermediate 11 was also considered. Pathway B proceeds through the formation of 12c, a compound observed as a minor product in our reaction mixture. Upon isolation of 12c, we re-exposed it to the cyclization conditions; however, no product formation was observed. Thus, 12c is an undesired by-product of the reaction rather than a productive intermediate, and pathway B was ruled out on the basis of this key result.
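To give a sense of scale for the computed +20 kJ/mol barrier, the sketch below evaluates the Eyring equation, k = κ(k_BT/h)·exp(−ΔG‡/RT). It is an illustration under stated assumptions rather than part of the reported computational work: the quoted barrier is treated as a free energy of activation, the transmission coefficient κ is set to 1, and room temperature is taken as 298.15 K. The function name is hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R = 8.314462618         # molar gas constant, J/(mol*K)


def eyring_rate(barrier_kj_mol, temperature_k, kappa=1.0):
    """First-order rate constant (s^-1) from the Eyring equation.

    Assumes the barrier is a free energy of activation (an assumption;
    the text only quotes an 'energy barrier').
    """
    exponent = -barrier_kj_mol * 1000.0 / (R * temperature_k)
    return kappa * (K_B * temperature_k / H) * math.exp(exponent)


if __name__ == "__main__":
    for label, temp in (("room temperature", 298.15), ("-10 C", 263.15)):
        k = eyring_rate(20.0, temp)
        half_life = math.log(2) / k
        print(f"{label}: k = {k:.2e} s^-1, half-life = {half_life:.2e} s")
```

Under these assumptions the estimated rate constant is on the order of 10^9 s^-1 at room temperature, which is consistent with the statement that the cyclization of 12b to 13 is highly feasible.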
To close out the route, TESOTf promoted ring opening/peracetylation of 13 and subsequent Vorbrüggen glycosylation using thymine, bis(trimethylsilyl)acetamide (BSA) and TMSOTf in a mixture of 1,2-DCE and MeCN gave 4ʹ-methylthymidine (15, β-anomer only) in only five total steps. The previous shortest chemical synthesis of this analogue was 13 steps, highlighting the improved efficiency of this process32. To demonstrate its scalability, we then used this route to produce 1.05 g of 15 in a single run without observing significant changes in yield.

Fig. 2: Five-step synthesis of nucleoside analogue 15.

A Process development. B Mechanistic investigations into the intramolecular trans-acetalization. a Isolated yields; b Carried out at −10 °C; c Carried out at 0 °C. BSA bis(trimethylsilyl)acetamide.

Enabled by the development of a versatile intramolecular trans-acetalization reaction, we proceeded to evaluate other Grignard reagents and nucleobases for the generation of high-value 4ʹ-modified nucleoside analogues (Fig. 3). Remarkably, this process proved compatible with an array of topical 4ʹ-modifications including methyl (15, 18–23), ethyl (24, 25), allyl (26, 27), trideuteromethyl (28–31), vinyl (32), and ethynyl (33, 34), which can be conveniently combined with a nucleobase of one’s choosing. We proceeded to generate 4ʹ-analogues of the natural products adenosine (20, 26, 28), thymidine (15, 31, 32), and cytidine (22, 25, 29, 34). Non-canonical nucleobases that are in high demand in drug discovery such as 6-methoxy-adenine (18), 2-chloro-adenine (23), 6-chloro-adenine (27), iodouracil (19, 30), and 2-fluoro-adenine (21, 33) were also incorporated in excellent overall yields. In all cases, these syntheses represent roughly a two to threefold improvement in step count over the previous shortest syntheses for 15 (refs. 14, 32), 19 (ref. 14), 20 (refs. 14, 32), 21–25 (refs. 14, 33, 34, 35), 33 (ref. 36), and 34 (ref. 36). For instance, during a drug discovery campaign pursuing treatments for RNA-dependent RNA viral infections, Isis Pharmaceuticals synthesized 22 in 14 steps via a semi-synthetic approach starting from diacetone-d-glucose33. In a very recent patent disclosing several DNA damage repair enzyme inhibitors, PrimeFour Therapeutics reported the only synthesis of 21 in 16 steps34. Even in comparison to routes that employed newly bioengineered enzymes, our sequence proved to be over twofold shorter (i.e., 15, 19, 20, and 23). For example, 23 was previously synthesized in 11 steps using a biocatalytic trans-glycosylation of 4ʹ-methyluridine with 2-chloroadenine14. The novel compounds (18, 26–32) synthesized with this expedited route map well onto structures previously disclosed in the recent patent literature and serve to highlight the utility of our process for exploring chemical space around this valuable chemotype.
While a few analogues (i.e., 26 and 27) were obtained in lower overall yields, it is important to recognize that chemical synthesis in medicinal chemistry prioritizes routes for their efficiency and the structural diversity they can access. Furthermore, during pandemic emergencies the ability to rapidly generate and identify antivirals becomes even more important in fighting waves of infection and viral mutations.
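The step-count comparison can be checked with simple arithmetic. The sketch below uses only the step counts stated in the text (13 steps for 15, 16 for 21, 14 for 22, and 11 for 23, against five steps here); the dictionary layout and function name are illustrative.

```python
# Previous shortest step counts quoted in the text, keyed by compound number.
PREVIOUS_STEPS = {"15": 13, "21": 16, "22": 14, "23": 11}
THIS_ROUTE_STEPS = 5


def fold_improvement(previous_steps, new_steps=THIS_ROUTE_STEPS):
    """Ratio of the earlier step count to the step count of the new route."""
    return previous_steps / new_steps


if __name__ == "__main__":
    for compound, steps in sorted(PREVIOUS_STEPS.items()):
        print(f"{compound}: {steps} -> {THIS_ROUTE_STEPS} steps "
              f"({fold_improvement(steps):.1f}-fold shorter)")
```

For these four examples the ratios fall between roughly 2.2 and 3.2, in line with the "two to threefold" description.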

Fig. 3: Substrate scope of C4ʹ-modified nucleoside analogues.

*Involves biocatalytic processes. BSA bis(trimethylsilyl)acetamide.

C4ʹ-modified nucleoside analogues of some currently marketed antivirals have never before been synthesized, presumably owing to the lengthy routes that would be required to make them. We sought to utilize our process in the synthesis of Ribavirin and Mizoribine analogues. Ribavirin is a broad-spectrum antiviral that is listed as an essential medicine by the World Health Organization (WHO)37, while Mizoribine is a natural product approved in Japan for use as an immunosuppressant during renal transplantation38. Uniquely, these nucleoside analogues contain unusual triazole and imidazole nucleobases, respectively. We synthesized their C4ʹ analogues (see Fig. 4; 36–39) in just five steps with modest to good overall yields. In MTT cell viability assays, compounds 36–39 showed minimal cytotoxicity, an important criterion for antiviral drug development, and, in some instances, were even less cytotoxic than the parent compounds Ribavirin and Mizoribine (see Supplementary Information pages 88–89). Next, we turned our attention to the synthesis of C-linked nucleosides, which offer much improved metabolic stability compared to their N-linked counterparts. Using a modified protocol adapted from a recent report by Li et al.20, we attempted a nickel-catalyzed C-glycosylation of 14, but this failed to provide the desired product. Surprisingly, using 40 instead of 14 as the glycosyl donor in the C-glycosylation afforded 41 in modest yield. Compound 40 was generated through an alternate ring opening/activation of 13 using 5 equivalents of acetic anhydride in the presence of TESOTf. Finally, by performing the initial aldol step with d-proline instead of l-proline, we made the 4ʹ-ethyl analogue (43) of Levovirin (42), an l-nucleoside and investigational HCV antiviral39.

Fig. 4: Synthesis of C4ʹ-modified ribavirin and mizoribine analogues, a C-linked analogue, and an l-nucleoside analogue.

acac acetylacetone, Terpy terpyridine.

In summary, we have developed a modular five-step de novo synthesis of C4ʹ-modified nucleoside analogues. This short sequence relies on an intramolecular trans-acetalization reaction to enable broad diversification via Grignard addition and Vorbrüggen glycosylation. Given the robustness of this protocol, several additional C4ʹ-modifications and nucleobases are anticipated to be compatible to support medicinal chemistry efforts. In all cases, this process is two to threefold shorter than the previous shortest syntheses reported for each analogue. This highly accessible and convenient protocol should facilitate the exploration of chemical space around this valuable chemotype and aid in the development of antiviral, anticancer, and oligonucleotide therapeutics.

Methods

Procedure for preparation of starting material 6

Aldol adduct 6 was prepared according to the previous literature23. To a solution of ketone 9 (5.10 mL, 38.4 mmol, 1 equiv.) in DMF (19.2 mL) at 0 °C was added l-proline (0.885 g, 7.68 mmol, 0.2 equiv.). The reaction mixture was then allowed to stir at this temperature for 30 min, after which time 2,2-dimethoxyacetaldehyde solution (60% by weight in water, 8) (5.80 mL, 38.4 mmol, 1 equiv.) was added. The reaction mixture was then left to stir at ~4 °C for 72 h. After completion, as monitored by TLC, the reaction mixture was diluted with EtOAc and water. The aqueous layer was extracted three times with EtOAc and three times with CH2Cl2, and was checked by TLC to ensure extraction was complete. The organic layers were then combined, dried with Na2SO4, filtered, and concentrated under reduced pressure. The crude reaction mixture was then purified with flash column chromatography (10% → 50% EtOAc in hexanes) to afford a single diastereomer of aldol adduct 6 as a light yellow oil (4.67 g, 52%). The enantiomeric excess was confirmed to be 93–94% ee.
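The quantities in this procedure can be cross-checked with a short calculation. The sketch below is illustrative only: the helper names are hypothetical, and the molecular formula of 6 (C10H18O6) is an assumption obtained by summing the formulas of aldehyde 8 (C4H8O3) and ketone 9 (C6H10O3), as expected for an aldol addition with no loss of atoms.

```python
# Standard atomic masses (g/mol), rounded.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

L_PROLINE = {"C": 5, "H": 9, "N": 1, "O": 2}
# Assumed formula for 6: sum of 8 (C4H8O3) and 9 (C6H10O3).
ALDOL_ADDUCT_6 = {"C": 10, "H": 18, "O": 6}


def molar_mass(formula):
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def mmol_from_mass(mass_g, formula):
    """Amount of substance in mmol for a given mass in grams."""
    return 1000.0 * mass_g / molar_mass(formula)


def percent_yield(product_mmol, limiting_mmol):
    """Isolated yield as a percentage of the limiting reagent."""
    return 100.0 * product_mmol / limiting_mmol


if __name__ == "__main__":
    proline_mmol = mmol_from_mass(0.885, L_PROLINE)
    product_mmol = mmol_from_mass(4.67, ALDOL_ADDUCT_6)
    print(f"l-proline: {proline_mmol:.2f} mmol")
    print(f"adduct 6: {product_mmol:.1f} mmol, "
          f"yield {percent_yield(product_mmol, 38.4):.0f}%")
```

Under the stated formula assumption, the calculated loading of l-proline and the isolated yield agree with the reported 0.2 equiv. and 52% to within rounding.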

General procedure for 1,2-additions

To a solution of 6 (1.00 equiv.) in THF (0.35 M) at −78 °C was added dropwise a solution of the Grignard reagent (4.00 equiv.). The reaction mixture was allowed to stir at −78 °C for 2 h and was then gradually warmed to room temperature overnight. After completion, as monitored by TLC, the reaction mixture was cooled to 0 °C and slowly quenched via dropwise addition of saturated ammonium chloride solution. The reaction mixture was vacuum filtered to remove solids. The filtrate was then diluted with dichloromethane and washed three times with water. The organic layer was then separated, dried with Na2SO4, filtered, and concentrated under reduced pressure. The crude reaction mixture was then purified with flash column chromatography to afford syn-10S (40%–72%).

General procedure for intramolecular trans-acetalization reaction

To a solution of syn-10S (1.00 equiv.) and 2,6-lutidine (1.00 equiv.) in CH2Cl2 (0.080 M) at −10 °C was added TMSOTf (2.00 equiv.) dropwise via a glass syringe. The reaction mixture was allowed to stir at −10 °C for 1.5 h and then water (1/3 of the volume of CH2Cl2 solvent) was added. The reaction mixture was allowed to stir at room temperature for 30 min. After completion, as monitored by TLC, the reaction mixture was diluted with CH2Cl2 and washed twice with water. The organic layer was then separated, dried with Na2SO4, filtered, and concentrated under reduced pressure. The crude reaction mixture was then purified with flash column chromatography to afford 13S (52%–75%).
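Because this general procedure is stated in equivalents and concentration, the quantities for a given scale follow by simple scaling. The sketch below is a hypothetical planning helper, not part of the reported method: the equivalents, concentration, and quench ratio come from the procedure above, while the molar masses of 2,6-lutidine (107.16 g/mol) and TMSOTf (222.25 g/mol) are standard values added here for convenience.

```python
LUTIDINE_MW = 107.16    # g/mol, 2,6-lutidine (C7H9N)
TMSOTF_MW = 222.25      # g/mol, TMSOTf (C4H9F3O3SSi)


def plan_trans_acetalization(substrate_mmol):
    """Scale the general procedure to a given amount of syn-10S (mmol).

    Uses 1.00 equiv. 2,6-lutidine, 2.00 equiv. TMSOTf, 0.080 M in CH2Cl2,
    and a water quench of one third of the CH2Cl2 volume.
    """
    if substrate_mmol <= 0:
        raise ValueError("substrate amount must be positive")
    dcm_ml = substrate_mmol / 0.080          # 0.080 M equals 0.080 mmol/mL
    lutidine_mmol = 1.00 * substrate_mmol
    tmsotf_mmol = 2.00 * substrate_mmol
    return {
        "dcm_ml": dcm_ml,
        "water_quench_ml": dcm_ml / 3.0,
        "lutidine_mmol": lutidine_mmol,
        "lutidine_mg": lutidine_mmol * LUTIDINE_MW,
        "tmsotf_mmol": tmsotf_mmol,
        "tmsotf_mg": tmsotf_mmol * TMSOTF_MW,
    }


if __name__ == "__main__":
    for name, value in plan_trans_acetalization(1.0).items():
        print(f"{name}: {value:.2f}")
```

Keeping TMSOTf in twofold excess over 2,6-lutidine in such a helper reflects the observation in the Results that excess TMSOTf relative to lutidine was critical.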

General procedure for ring opening/peracetylations

To a solution of 13S (1.00 equiv.) in Ac2O:CH2Cl2 (0.20 M; 1:1 mixture) at −5 °C was added TESOTf (0.625 equiv.) dropwise. The reaction mixture was allowed to stir at −5 °C for 2 h. After completion, as monitored by TLC, the reaction mixture was diluted with dichloromethane and washed three times with saturated sodium bicarbonate solution. The organic layer was then separated, dried with Na2SO4, filtered, and concentrated under reduced pressure. The crude reaction mixture was then purified with flash column chromatography to afford 14S (61%–74%). In some cases, the α-anomer was also formed. The α-anomer can be combined with 14S for the next step without any change in the outcome of the subsequent glycosylation step.

General procedure for glycosylations

To a solution of nucleobase (1.00 equiv.) in dry MeCN (0.10 M) at 0 °C were added dropwise BSA (3.00 equiv.) and TMSOTf (2.00 equiv.). The reaction mixture was then heated to 60 °C for 2 h. The reaction mixture was then cooled to 0 °C and the sugar 14S (1.00 equiv.) in 1,2-DCE (0.30 M) was added. The reaction mixture was then heated to 60 °C and left stirring at this temperature overnight. After completion, as monitored by TLC, the reaction mixture was concentrated under reduced pressure and the crude reaction mixture was purified with flash column chromatography to afford PA-15S (70%–100%). PA-15S was then dissolved in a solution of ammonia in methanol and stirred for 16 h. The reaction mixture was then concentrated under reduced pressure to afford pure 15S (100%). The acetamide by-product was removed by blowing a steady stream of air over the concentrated sample overnight.
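The per-step yield ranges quoted in these procedures can be multiplied to bracket the overall yield of the sequence. The sketch below is an illustration only: it assumes that the lowest (or highest) yields of every step coincide for a single substrate, which is unlikely in practice, so the result is a bound rather than a prediction. The ammonia/methanol deprotection is listed separately here solely because a separate yield is quoted for it.

```python
from math import prod

# (step, lowest reported yield, highest reported yield), from the Methods.
STEP_YIELDS = [
    ("aldol to 6", 0.52, 0.52),
    ("1,2-addition", 0.40, 0.72),
    ("intramolecular trans-acetalization", 0.52, 0.75),
    ("ring opening/peracetylation", 0.61, 0.74),
    ("glycosylation", 0.70, 1.00),
    ("ammonia/methanol deprotection", 1.00, 1.00),
]


def overall_yield_bounds(steps):
    """Return (lower, upper) bounds on the overall fractional yield."""
    lower = prod(low for _, low, _ in steps)
    upper = prod(high for _, _, high in steps)
    return lower, upper


if __name__ == "__main__":
    low, high = overall_yield_bounds(STEP_YIELDS)
    print(f"Overall yield bounds: {low:.1%} to {high:.1%}")
```

Under these assumptions the bounds span roughly 5% to 21% overall from ketone 9.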

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.