A general strategy for diversifying complex natural products to polycyclic scaffolds with medium-sized rings

The interrogation of complex biological pathways demands diverse small molecule tool compounds, which can often lead to important therapeutics for the treatment of human diseases. Since natural products are the most valuable source for the discovery of therapeutics, the derivatization of natural products has been extensively investigated to generate molecules for biological screenings. However, most previous approaches only modified a limited number of functional groups, which resulted in a limited number of skeleta. Here we show a general strategy for the preparation of a library of complex small molecules by combining state-of-the-art chemistry – the site-selective oxidation of C-H bonds - with reactions that expand rigid, small rings in polycyclic steroids to medium-sized rings. This library occupies a unique chemical space compared to selected diverse reference compounds. The diversification strategy developed herein for steroids can also be expanded to other types of natural products.

M edium-sized rings are largely under-represented in current screening libraries for drug development, and not unexpectedly, they were rarely found in the top 200 brand name and top 200 generic pharmaceuticals 1 . While drug candidates may not enter the commercial market for many reasons, synthetic accessibility undoubtedly affects drug discovery and can bias subsequent drug development. Using state-of-the-art chemistry, we have developed a library of polycyclic mediumsized ring compounds that helps fill the void in an underexplored chemical space.
Natural products have an inherent diversity that is unparalleled by current synthetic libraries and their success rate as drug leads is unrivaled 2 . As such, they have been regarded as one of the most valuable sources for library generation 3 and lead compounds in drug development [4][5][6][7][8][9] . Natural products often contain privileged scaffolds, structural complexity, and abundant stereochemistry throughout the sp 3 -hybridized carbon atom skeleton 6,7 . The diversification of natural products has the potential to generate structurally complex small molecules with varied skeleta and ample stereochemical handles 7 . Synthetic strategies such as diversity-oriented synthesis (DOS) [10][11][12] , biological-oriented synthesis (BiOS) 13 , and function-oriented synthesis (FOS) 14 have been successfully executed to provide efficient access to many structurally complex and functionally diverse small molecules [15][16][17][18][19][20] . However, most of the synthetic strategies published to date suffer from a significant limitation-they rely on the presence and reactivity of specific functional groups (FGs) that are only present in certain classes of natural products.
Conversely, C-H bonds are ubiquitous and exist in almost every organic compound. For this reason, C-H functionalization has become one of the major focuses in organic chemistry. C-H functionalization chemistry has been applied to the synthesis of a number of natural products and pharmaceutical reagents 21 , including target-oriented synthesis and last-stage functionalization of natural products 22 . We envisioned that the diversification of natural product skeleta via C-H functionalization would offer a general strategy for the synthesis of many structurally complex and functionally diverse natural product-like small molecules, while simultaneously preparing compounds in an underexplored chemical space.
Following C-H functionalization, we plan to use the resulting functional group handles for ring expansion reactions to prepare medium-sized rings, which are core components in bioactive natural products and medicinally relevant compounds 23 . Due to unfavorable transannular interactions and entropic effects 24 , the synthesis of medium-sized rings has been a longstanding challenge in organic chemistry. Even now, general strategies for their formation are limited 25,26 . Medium-sized rings aren't subject to the rigidity of small rings, yet they lack the flexibility of linear compounds or macrocycles. Furthermore, their reactivity can be difficult to predict owing to the presence of multiple low energy conformations 27 . These unique properties make them attractive yet challenging targets.
To reach these targets we employed nature's most robust molecular family-terpenes, but more specifically steroids. Of all natural product families, the terpenes exhibit unparalleled chemical and structural diversity 28,29 . Their biosynthesis involves two major phases: cyclization of hydrocarbons by cyclase and oxidation by cytochrome P450 (CYP) enzymes 30 . The chemistry underpinning their biosynthesis, particularly the oxidation component, inspired Baran's group to complete several efficient total syntheses of complex natural products by preparing the hydrocarbon skeleta with low oxidation stage followed by selective C-H oxidation [31][32][33][34][35][36] . Steroids are perhaps the most privileged structures in the history of drug development. In fact, more than 100 different steroid molecules have been approved by the U.S. Food and Drug Administration (FDA) as therapeutics for a wide range of symptoms and diseases, such as inflammation, pain, cancers, and bacterial infections 37 . Not surprisingly, a variety of methods have been developed for the modification of steroids. However, most of these methods focused on simple modifications repeatedly using the same core skeleta and FGs 38,39 . Examples of significant diversification are rare 37,40 .
Inspired by nature's biosynthesis pathways and Baran's twophase strategy for the total synthesis of terpenes 31-36 , we envisioned that oxidation of the most abundant C-H bonds in polycyclic natural products to C-O bonds, followed by ring expansion, may offer a general and efficient strategy for the diversification of natural products and yield a collection of compounds to probe an underexplored chemical space-polycyclic compounds with medium-sized rings (Fig. 1). To probe the feasibility of this two-phase diversification strategy, we investigated the viability of synthesizing medium-sized rings via ring expansion reactions with substrates featuring native C-O bonds, as in hydroxyl or ketone groups. Success in this endeavor then led us to use a sequential C-H oxidation/ring expansion strategy to diversify polycyclic natural products.
Our strategy is a hybrid of target and diversity-oriented synthesis. We targeted the formation of medium-sized rings yet wanted to diversify our products as well. Innovative strategies, such as these, are crucial to making compound libraries that represent underexplored chemical space. We herein report our diversification strategy for the synthesis of natural product-like, polycyclic scaffolds containing medium-sized rings (7-11 membered) using ring expansion reactions and C-H functionalization.

Results
Ring expansion based on native C-O bonds. Our initial studies on the ring expansion strategy used four steroid natural products -dehydroepiandrosterone (DHEA), cholesterol, isosteviol, and estrone-all of which have been reported to exhibit broad pharmacological activities. As shown in Fig. 2, the A-rings of 1a and 1b were expanded to [5.3.0] fused bicycles 3a and 3b through an intramolecular Schmidt reaction. Reduction of the ketone in 3a provided alcohol 4a as a single isomer, whose relative stereochemistry was confirmed by X-ray crystallography. The stereochemistry of 4b was assigned by analogy. The β-keto esters 2a, 2b, and 2c could undergo a formal [2 + 2] cycloaddition reaction with dimethyl acetylenedicarboxylate (DMAD) leading to ringexpanded compounds 5a-c after fragmentation of the cyclobutene moiety to generate the corresponding two-carbon ring-enlargement compound. The esters 5a and 5c were then further elaborated to give [6.3.0] fused bicyclic anhydrides 6a and 6c. Fused bicyclic anhydride moieties present in natural products have been reported to exhibit a diverse range of bioactivity 41 , and the anhydrides can also be easily derivatized, as discussed later (Fig. 2a, Supplementary Fig. 5). The bridged five-membered ring of isosteviol 1d could be expanded to six-membered β-keto ester 7 by treating with BF 3 •Et 2 O and ethyl diazoacetate. Ester 7 could also undergo ring expansion upon reaction with DMAD and followed by AcOH/HCl to give [6.3.0] fused bicyclic anhydride 8. The six-membered ring in 7 could also be converted to lactam 9 by a Beckmann rearrangement (Fig. 2b, Supplementary Fig. 6). The D-ring of 13 could be expanded to benzannulated mediumsized ring 14 with high diastereoselectivity (d.r. = 18:1) via ring expansion/aryne insertion chemistry. The relative stereochemistry of 14 was confirmed by NOE experiments. Interestingly, the major stereoisomer 14 could be isomerized to the minor stereoisomer 15 using TBAF in THF. The expansion of the D-ring in estrone (Fig. 2c, Supplementary Fig. 7) and DHEA (Fig. 2d, Supplementary Fig. 7) could be realized by an acylation/ring expansion sequence leading to nine-membered rings bearing a βketo ester motif within polycyclic compounds 12 and 17 42 . Furthermore, the newly formed medium-sized ring 17 could then undergo an additional Schmidt reaction leading to 18 (Fig. 2d, Supplementary Fig. 7). The D-ring of DHEA could also be expanded to other medium-membered ring scaffolds through a similar two-carbon ring expansion strategy (Fig. 2d, Supplementary Fig. 7). The transformations shown in Fig. 2 indicate both the feasibility and generality of using native C-O bonds for ring expansion to form medium-sized rings in natural product diversification.
Sequential C-H oxidation/ring expansion. C-H functionalization reactions, especially C-H oxidation, have recently emerged as a powerful strategy for the derivatization of complex organic compounds. After determining that polycyclic steroids containing native C-O bonds could be expanded to contain medium-sized rings, we significantly diversified natural products by applying a two-phase strategy-installation of C-O bonds via C-H oxidation followed by ring expansion. Among the C-H oxidation methods, electrochemical oxidation generates less toxic waste and is compatible with a wide range of functional groups, making it an attractive complementary strategy to traditional chemical reagents 43,44 . Employing the electrochemical allylic C-H oxidation method developed by Baran 45,46 , this procedure, in conjunction with a Beckmann rearrangement, allowed us to synthesize several seven-membered lactams ( Fig. 3a, b, Supplementary Fig. 8, 9). Using a site-selective copper-mediated C-H oxidation 47 and Beckmann rearrangement, we expanded the Cring of estrone and DHEA (Fig. 3c, Supplementary Fig. 10). Lastly, we used a chromium-mediated benzylic C-H oxidation 48 of 1j to yield ketone 28, which provided lactam 29 after a Beckmann rearrangement (Fig. 3d, Supplementary Fig. 11).
Diversification of more complex natural products. We next examined the scope of the C-H oxidation/ring expansion strategy by applying it to the diversification of two significantly more complex natural products-picfeltarraegenin 30 and kirenol 36. Protection and reduction with NaBH 4 converted natural product 30 to 32. From here, the electrochemical allylic C-H oxidation/ ring expansion strategy was successfully applied to the synthesis of seven-membered lactam-containing compound 35, whose relative stereochemistry was confirmed by NOE experiments (Fig. 4a, Supplementary Fig. 12). Protection of 36 by cyclic carbonate and MOM ethers converted natural product 36 to 37. Allylic C-H oxidation of 37, followed by oxidation of the OH group, afforded enone 38. Finally, a Beckmann rearrangement yielded lactam 39, whose relative stereochemistry was confirmed by NOE experiments (Fig. 4b, Supplementary Fig. 12).
Further diversification of medium-sized rings. Previously synthesized products containing medium-sized rings offered multiple opportunities for further functionalization. To demonstrate the feasibility of our C-H oxidation/ring expansion strategy for the preparation of compound libraries, we further derivatized some of the functional groups to yield a collection of 150 compounds (Fig. 5). For example, hydroxyl groups were derivatized to carbamates, β-keto esters underwent alkylation, anhydrides were converted to imides, and ring-opening reactions were performed on some of the medium-sized ring. Removal of the TBS protecting group provided the corresponding alcohols, which were then converted into a library of carbamates by treatment with CDI and a range of amines. (Fig. 5a, Supplementary Fig. 13). Alkylation of the β-keto ester scaffold afforded a single stereoisomer, which could also be converted to libraries of carbamates 45 and triazoles 46 (  15). Ring-cleavage of 5b and 5c provided 49, which possesses three orthogonal FGs that can be further modified to larger collections of compounds. For example, direct reaction of the anhydride with amines resulted in the formation of imides 50. Subsequently, the carboxylic acid moiety was easily converted to libraries of amides 51 and 52. The alcohol moieties of libraries of 52 could react with CDI and amines to produce libraries of carbamates 54. The carboxylic acid and alcohol FGs in library 50 could react with CDI, followed by treatment with amines to yield library 53 (Fig. 5e, Supplementary Fig. 16).
All of the compounds we prepared are natural product-like. They contain structural features often found in natural products, such as polycyclic ring systems, sp 3 hybridized carbons, and abundant stereogenic centers, while also being densely populated with FGs capable of hydrogen bonding. Since medium-sized rings can adopt multiple low energy conformations, these compounds may exhibit multiple modes of binding in biological contexts. Most remarkably, a large number of scaffolds with diverse structural features can be accessed using a relatively small set of C-H oxidation and ring expansion reactions.
Chemoinformatic analysis. We used principal component analysis (PCA) 49 to compare our compound library with approved therapeutics and natural products following the protocols employed by Tan 50 . We selected 30 representative compounds from our polycyclic, medium-sized ring library ( Supplementary  Fig. 3) and compared them with 40 top-selling small-molecule drugs in 2016 (without including the top-selling steroid drugs) 51 , 25 top-selling small-molecule steroid drugs in 2016 51 , a set of 25 steroid and terpenoid natural products with diverse oxidative stage and skeleta 28,29,38 , and a set of 25 diverse natural products with a medium-sized ring 16 Table 3) and are represented in 2D plots (Fig. 6a-c, for expanded version see Supplementary Fig. 4a-c).
As shown in Fig. 6a-c, the parameters that have the greatest contributions to PC1 are H-Bond acceptors, number of rotatable bonds, polar surface area (tPSA) and molecular weight, which shift molecules right along the x axis in the plots of PC1/PC2 and    These parameters have the effect of moving the molecules to the negative direction along the PC3 axis in the plots of PC1/PC3 and PC2/PC3, and the number of O atoms, which shifts compound in a positive direction in these plots.
As shown in Fig. 6a-c, there are significant overlaps between our medium-sized ring library compounds and steroid natural products, especially with top-selling steroid drugs, suggesting the drug-likeness of our compounds. In addition, expanding the small rigid rings in steroid natural products to more flexible medium-sized rings covered broader chemical space occupied by steroid natural products and steroid drugs, implying the great potential of our compounds to occupy underexplored biological space. In two of the three variations (Fig. 6a and c), the mediumsized ring library occupied a distinct region of chemical space compared to the top-selling drugs. For example, our polycyclic compounds derived from steroid natural products generally have more stereogenic centers (average 7.4), OpreaRingCount (average 4.9), and rings (average 5.1), than those for drugs (average 2.7, 3.6, 3.7) (Supplementary Table 2).
We also carried out a principal moment of inertia (PMI) analysis to compare the three-dimensional shapes of the lowest-energy conformations of our library with the different reference sets as previously reported (Fig. 6d, for expanded version see Supplementary Fig. 4d) 53 . PMI analysis outlines differences of the threedimensional molecular shapes of our medium-sized rings compared to the reference sets of libraries. Steroid drugs and diverse steroid and terpenoid natural products reside along the rod-disc (left) side of the triangle, with a preference for the rod vertex. Although the distinctions between the different reference sets are less obvious in the PMI plot, we were pleased to find that our library covers a much broader three-dimensional chemical space than known steroid drugs and diverse steroid and terpenoid natural products.

Discussion
In summary, we have developed a convenient approach for the rapid preparation of large numbers of polycyclic compounds. These small molecules boast diverse skeleta, multiple stereogenic centers, and medium-sized rings. Designing and synthesizing a library of this magnitude is no small feat, yet it is a timely addition to current compound libraries, most of which lack significant diversity. The success of our approach is predicated on the synergy that results from using state-of-the-art site-selective oxidation of C-H bonds in conjunction with ring expansion reactions. By transforming small, rigid rings into conformationally-labile medium-sized rings, this methodology can have significant biological implications. Mediumsized rings, which have multiple low energy conformations, can conceivably engage in multiple modes of binding, as opposed to small rings, which are often locked in a single conformation. Using chemoinformatics, we demonstrated that our synthesized compound library occupies unique chemical space compared to diverse reference compounds. This diversification strategy, while developed for steroid natural products, has the potential to be further expanded to other types of natural products or natural product-like compounds. We also demonstrated that our compounds with diverse scaffolds can be easily further derivatized, which is important for the discovery of bioactive small-molecule probes. In the future, natural product diversification by sequential C-H oxidation and ring expansion can be a general approach in the continued evolution of diversity-driven synthetic strategies. Our medium-sized ring library is now being screened against a wide range of challenging biological targets. Breakthroughs in organic synthesis often bias the drug discovery landscape in the years that follow. As computational methods, automation, and high-throughput screening continue to gain momentum, unsolved problems in synthetic methodology can lead to drug development bottlenecks 54 . Our diversification strategy allows us to diversify natural products, easily synthesize medium-sized rings, a historical hurdle in organic synthesis, and access uncharted chemical space. We hope that the methods described herein will motivate others to look at organic synthesis as a call to action instead of a constraint.

Methods
General remarks. All detailed synthetic methods are included in the supplementary information. Some representative general procedures are provided here.
General procedure for the formation of β-keto ester. A solution of diisopropylamine (1.3 equiv) in tetrahydrofuan (THF) was cooled to −78°C. To the above solution was added n-BuLi (1.25 equiv) dropwise. The resulting solution was stirred for 45 min under argon. To the above reaction mixture was added a solution of ketone (1.0 equiv) in THF at −78°C. The reaction mixture was kept for 2 h at this temperature. Ethyl cyanoformate (1.3 eq) was added to the above solution. The resulting mixture was stirred at −78°C for 1 h. The reaction was allowed to warm to room temperature. After quenching with NH 4 Cl solution, the mixture was diluted with ethyl acetate and washed with brine (50 mL × 2). The combined organic phase was dried over sodium sulfate and concentrated under vacuum. Flash column chromatography over silica gel afforded the β-keto ester.
General procedure for the alkylation of β-keto ester. To a solution of β-keto ester (1.0 equiv) and hexamethylphosphoramide (HMPA) (2.0 equiv) in THF (10 mL/mmol) was added NaH (1.3 equiv) under argon. The reaction mixture was stirred for 1 h. Then alkyl iodide (5.0 equiv) was added. The mixture was stirred for 24 h. After quenching with NH 4 Cl solution, the mixture was diluted with ethyl acetate and washed with NaS 2 O 3 and brine. The organic phase was dried over sodium sulfate and concentrated under vacuum to afford the crude product. Flash column chromatography over silica gel afforded the product.
General procedure of Backman rearrangement. To a solution of ketone (1.0equiv) in EtOH (10 mL/mmol) was added hydroxylamine hydrochloride (10.0 equiv) and KOAc (10.0 equiv). The reaction mixture was heated at reflux for 3 h. The mixture was cooled to room temperature. The volatile EtOH was removed under vacuum and the resulting residue was diluted with ethyl acetate and washed with brine (20 mL × 2). The organic phase was dried over sodium sulfate and concentrated under vacuum to afford crude product, which was directly used in the next step without further purification.
To a solution of the above crude product in pyridine (5.0 mL/mmol) was added p-toluenesulfonyl chloride (2.0 equiv) and dimethylaminopyridine (0.1 equiv). The resulting reaction mixture was heated at 60°C overnight, cooled to room temperature, diluted with ethyl acetate (100 mL), and washed with 2 N HCl, NaHCO 3 and brine sequentially. The organic phase was dried over sodium sulfate Alternative procedure for the formation of carbamates. To a solution of alcohol (0.10 mol) in dichloromethane (2 mL) was added N, N-dialkyl-1H-imidazole-1carboxamide (0.20 mmol) and t-BuOK (22.4 mg, 0.20 mmol). The reaction mixture was stirred at room temperature overnight. The mixture was diluted with ethyl acetate (30 mL) and washed with brine (10 mL × 2). The organic phase was dried over sodium sulfate and concentrated under vacuum. Flash column chromatography over silica gel afforded the carbamates.
General procedure for the formation of imides. To a solution of 8 (44.2 mg, 0.1 mol) in dichloromethane (0.5 mL) and toluene (3.0 mL) was added amine (0.5 mmol). The reaction mixture was stirred for 30 min and then heated at reflux for 2 h. The mixture was cooled to room temperature, and concentrated under vacuum. Flash column chromatography over silica gel afforded the imide. (Note, DMF was used as solvent for amino acids).
Alternative procedure for the formation of imides. A solution of 6a (0.50 mmol) in aqueous NaOH (30%, 5 mL) and THF (5 mL) was stirred for 2 h at 50°C. The reaction mixture was cooled to 0°C and then NaBH 4 (76 mg, 2.00 mmol) was added in one portion. The reaction mixture was allowed to warm to room temperature and stirred for 3 h. The reaction mixture was concentrated under vacuum, acidified by the slow addition of 1 N HCl, extracted with EtOAc (3 × 30 mL). The combined organic extracts were washed with H 2 O (30 mL), brine (30 mL), and dried over Na 2 SO 4 . The organic phase was concentrated under vacuum and used for the next step. To a solution of the above crude product in dichloromethane (0.5 mL) and toluene (3.0 ml) was added amine (0.5 mmol). The reaction mixture was stirred for 30 min then heated at reflux for 2 h. The mixture was cooled to room temperature, and concentrated under vacuum. Flash column chromatography over silica gel afforded the imide.
To a solution of the above imide in dichloromethane (5 mL) was added Dess-Martin periodinane (848 mg, 2.0 mmol). The reaction mixture was stirred at room temperature for 3 h. The mixture was diluted with ethyl acetate (30 mL) and washed with NaS 2 O 3 (10 mL) and brine (10 mL × 2). The organic phase was dried over sodium sulfate and concentrated under vacuum. Flash column chromatography over silica gel afforded the final imide product.

Data availability
For the details of the synthetic procedures, 1 H, 13