Near quantitative synthesis of urea macrocycles enabled by bulky N-substituent

Macrocycles are unique molecular structures extensively used in the design of catalysts, therapeutics and supramolecular assemblies. Among all reactions reported to date, systems that can produce macrocycles in high yield under high reaction concentrations are rare. Here we report the use of dynamic hindered urea bond (HUB) for the construction of urea macrocycles with very high efficiency. Mixing of equal molar diisocyanate and hindered diamine leads to formation of macrocycles with discrete structures in nearly quantitative yields under high concentration of reactants. The bulky N-tert-butyl plays key roles to facilitate the formation of macrocycles, providing not only the kinetic control due to the formation of the cyclization-promoting cis C = O/tert-butyl conformation, but also possibly the thermodynamic stabilization of macrocycles with weak association interactions. The bulky N-tert-butyl can be readily removed by acid to eliminate the dynamicity of HUB and stabilize the macrocycle structures.


Materials
All reagents were purchased from Sigma-Aldrich or TCI and used as received unless otherwise noted. Deuterated chloroform (CDCl3) was purchased from Cambridge Isotope Laboratories. Tetrahydrofuran (THF) was dried with a column packed with alumina. HPLC grade 0.1% TFA-H2O and acetonitrile were purchased from Fisher Scientific Company LLC (Hanover Park, IL, USA).

Nuclear Magnetic Resonance (NMR) Spectroscopy
1H and 13C nuclear magnetic resonance (NMR) spectra were recorded on a Varian U500, VXR500, UI500NB or a Bruker Carver B500 spectrometer, with shifts reported in parts per million downfield from tetramethylsilane and referenced to the residual solvent peak. Nuclear Overhauser effect (NOE) spectra were recorded on an Agilent VNS 750NB spectrometer. MestReNova 11.0.1 was used to analyze all spectra.

Fourier-transform infrared spectroscopy (FTIR)
Fourier transform infrared (FT-IR) spectra were recorded on a Spectrum 100 spectrometer (Perkin Elmer) using a KBr salt plate.

Electrospray ionization mass spectrometry (ESI-MS)
Electrospray ionization (ESI) mass spectrometry was performed on a Waters Quattro Ultima II. The solvent medium was a 50% methanol solution containing 0.2% formic acid.

Gel Permeation Chromatography (GPC)
Gel permeation chromatography (GPC) was performed in chloroform at a flow rate of 1.0 mL/min on a system equipped with a Model 1260 Infinity isocratic pump (Agilent Technology) in series with a 717 Autosampler (Waters) and size exclusion columns (50 Å and 100 Å Phenogel columns, 5 µm, 300 × 7.8 mm, Phenomenex) connected in series. An Optilab rEX refractive index detector (Wyatt Technology) operating at a wavelength of 658 nm was used as the detector. Samples were filtered through a 0.45 µm PTFE filter before analysis.

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS)
MALDI-TOF spectra were obtained on a Bruker Daltonics UltrafleXtreme MALDI TOFTOF equipped with a 337 nm nitrogen laser. The sample was dissolved in chloroform at a concentration of 10 mg/mL. The cationization agent CF3COONa was dissolved in THF at a concentration of 10 mg/mL. The matrix, α-cyano-4-hydroxycinnamic acid (CHCA, Sigma-Aldrich), was dissolved in THF at a concentration of 10 mg/mL. Solutions of matrix, sample and cationization agent were mixed in a volume ratio of 4/1/1. The mixed solution (1 µL) was spotted on the MALDI sample plate and air-dried. All spectra were recorded in reflectron mode.

Scanning electron microscopy (SEM)
The size and morphology of the samples were characterized using a Hitachi S-4800 SEM operated at 10 kV. The samples were first deposited on a clean Si wafer, dried under ambient conditions and then sputter-coated with Au before SEM characterization.

Molecular dynamics simulation: Ring model generation:
All-atom molecular models of the 1mer, 2mer and 3mer macrocycles and of the chloroform solvent were generated using the Automated Topology Builder (ATB) server (http://atb.uq.edu.au) (1) and modeled using the GROMOS 54A7 force field (2). Atomic partial charges were assigned using semiempirical quantum mechanical calculations conducted using the MOPAC method (3), and all molecules carried zero net charge.

Simulation set up
Molecular dynamics simulations were conducted using the GROMACS 4.6 simulation suite (4). Lennard-Jones interactions were shifted smoothly to zero at 1.4 nm, and interactions between unlike atoms were specified by Lorentz−Berthelot combining rules (5). Coulomb interactions were treated by Particle Mesh Ewald (PME) with a real-space cutoff of 1.4 nm and a 0.12 nm reciprocal-space grid spacing (6). Bond lengths were fixed to their equilibrium values using the LINCS algorithm (7). Temperature was maintained at 300 K using a Nosé-Hoover thermostat (8) and pressure at 1.0 bar using an isotropic Parrinello−Rahman barostat (9). Newton's equations of motion were integrated using the leap-frog algorithm with a time step of 2 fs (10). System configurations were saved for analysis every 2 ps. Calculations were conducted on NVIDIA Quadro K1200 GPU cards, achieving execution speeds of about 30 ns/day. Simulation trajectories were visualized using Visual Molecular Dynamics (VMD) (11). For single-macrocycle simulations, one n-mer was placed in a 4×4×4 nm3 box. For multiple-macrocycle simulations, 48 1mers, 24 2mers, 16 3mers, 12 4mers or 8 6mers were placed in an 8×8×8 nm3 box, so that the total number of monomers was 48 in every case. Boxes were then filled with chloroform molecules and subjected to steepest descent energy minimization until the maximum force on any atom was less than 10 kJ/(mol·nm). Atomic velocities were initialized from a Maxwell distribution at 300 K, and the systems were then simulated for 50 ns. The non-bonding energies were computed over the equilibrated portion of each trajectory within the last 20 ns. The non-bonding energies included only intra- and inter-macrocycle interactions and ignored solvent-solvent and solvent-macrocycle interactions.
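The bookkeeping implied by this protocol (number of integration steps, number of saved frames, and the constant monomer count across the multiple-macrocycle boxes) can be checked with a short Python sketch. The function and variable names are illustrative only and are not taken from the simulation input files.

```python
# Arithmetic check of the simulation protocol; all names are hypothetical.

TIME_STEP_FS = 2          # leap-frog time step
SAVE_INTERVAL_PS = 2      # configurations saved every 2 ps
RUN_LENGTH_NS = 50        # production run length
ANALYSIS_WINDOW_NS = 20   # last portion used for non-bonding energies

# ring size n -> number of rings placed in the 8x8x8 nm3 box
MULTI_RING_BOXES = {1: 48, 2: 24, 3: 16, 4: 12, 6: 8}


def n_steps(run_ns: int, dt_fs: int) -> int:
    """Number of integration steps (1 ns = 1,000,000 fs)."""
    return run_ns * 1_000_000 // dt_fs


def n_frames(length_ns: int, save_ps: int) -> int:
    """Number of saved configurations (1 ns = 1,000 ps)."""
    return length_ns * 1_000 // save_ps


def monomers_per_box(boxes: dict) -> dict:
    """Total monomer count in each box: ring size times ring count."""
    return {size: size * count for size, count in boxes.items()}


if __name__ == "__main__":
    print("integration steps:", n_steps(RUN_LENGTH_NS, TIME_STEP_FS))
    print("frames saved:", n_frames(RUN_LENGTH_NS, SAVE_INTERVAL_PS))
    print("frames analysed:", n_frames(ANALYSIS_WINDOW_NS, SAVE_INTERVAL_PS))
    print("monomers per box:", monomers_per_box(MULTI_RING_BOXES))
```

Each multiple-macrocycle box contains 48 monomers regardless of ring size, which is what makes the per-monomer energies of the different systems comparable.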

PMF calculation using umbrella sampling
Umbrella sampling of two macrocycles at different center of mass (COM) separations was conducted in a box of size 8×8×8 nm3. A total of 34 individual umbrella sampling runs on the macrocycle COM distance r were conducted over the range r = 0.2 nm to 3.5 nm, with umbrella windows placed uniformly at 0.1 nm intervals and harmonic restraining potentials with force constants of 1000 kJ/(mol·nm2) placed at the center of each window (12). Each umbrella simulation was conducted for 5 ns, and the unbiased potential of mean force (PMF) curve W(r) was reconstructed from the biased umbrella simulation trajectories using the Weighted Histogram Analysis Method (WHAM) (13,14) implemented in g_wham in GROMACS 4.6 (15).
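As a minimal sketch of the window layout described above, the following Python snippet generates the window centers and evaluates a harmonic restraint. It assumes the conventional form V(r) = (1/2) k (r - r0)^2 for the bias; the function names are hypothetical and are not part of GROMACS.

```python
# Sketch of the umbrella-window layout; helper names are illustrative.

def umbrella_windows(r_min_nm: float = 0.2,
                     r_max_nm: float = 3.5,
                     spacing_nm: float = 0.1) -> list:
    """Return uniformly spaced window centers, both endpoints included."""
    n_windows = round((r_max_nm - r_min_nm) / spacing_nm) + 1
    # Build each center from the integer index to avoid accumulated
    # floating-point drift, then round for readability.
    return [round(r_min_nm + i * spacing_nm, 3) for i in range(n_windows)]


def harmonic_bias(r_nm: float, center_nm: float,
                  k_kj_mol_nm2: float = 1000.0) -> float:
    """Restraint energy in kJ/mol, assuming V = 0.5 * k * (r - r0)**2."""
    return 0.5 * k_kj_mol_nm2 * (r_nm - center_nm) ** 2


if __name__ == "__main__":
    centers = umbrella_windows()
    print(len(centers), "windows from", centers[0], "to", centers[-1], "nm")
    print("bias 0.1 nm from a window center:",
          harmonic_bias(0.3, 0.2), "kJ/mol")
```

With 0.1 nm spacing from 0.2 nm to 3.5 nm inclusive, the layout gives the 34 windows stated in the text.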

Thermodynamic model
Assume that there are N monomers in the solvent and that these monomers can aggregate into clusters of k monomers, with n_k clusters of size k. The system can then be characterized by the state vector {n_k}, which satisfies ∑_k k n_k = N. The probability of observing {n_k} is given by Eq.1, where U({n_k}) is the potential energy of the state {n_k}, S({n_k}) is the entropy of the state {n_k}, β = 1/(k_B T), k_B is the Boltzmann constant, and Z is the partition function, which is also a constant. The entropy term in Eq.1 is given by Eq.2, where V = L^3 is the volume and Λ_k is the thermal de Broglie wavelength of a k-mer (Eq.3), in which h is the Planck constant and m is the mass of a monomer. The first term on the r.h.s. of Eq.2 is the translational entropy of the particles of interest, the second term is the configuration entropy of a cluster of k monomers, and the last term is the solvent entropy, which is independent of {n_k}. For simplicity, we treat each cluster as rigid and ignore the second term, and set the solvent entropy to be a constant. The energetic term in Eq.1 can be written as Eq.4, in which the first term is the potential energy of a cluster of k monomers, the second term is the energy of the particle-particle and particle-solvent interactions, and the last term is the potential energy of the solvent. For simplicity, we treat the last two terms in Eq.4 as constant and independent of {n_k}.
After plugging Eq.2 and Eq.4 into Eq.1, we obtain: Eq.5 Where Z is the partition function and absorbs all assumed constants: , is the density of the particle of interest, which satisfies ∑ = 1. Since Eq.5 becomes:

Eq.9. The final state of the system after it reaches equilibrium can be estimated by minimizing the objective function in Eq.9, subject to the normalization constraint on the densities stated above. The remaining unknowns in Eq.9 are the potential energies of aggregates of different sizes, which can be measured in molecular simulation.

Temperature, concentration and solvent dependence of HUM1 formation
Temperature: the formation kinetics varied with temperature, but the final yields were all near quantitative from 20 to 75 °C.
N1 and A1 were mixed in a 1:1 ratio in CDCl3 (or C2D4Cl2 for higher temperatures) with a final concentration of 50 mM. The mixtures were capped and incubated at 20 °C, 37 °C, 60 °C and 75 °C, respectively. Final products and yields were confirmed by 1H NMR. At higher temperatures, HUM1 was formed in near quantitative yield in less than 1 h. At lower temperatures, the equilibration took longer because of the lower reversibility. At room temperature, the mixture was monitored for 5 days and the final yield was also near quantitative. Each group was repeated three times.
Concentration: HUM1 was formed in near quantitative yields over a concentration range from 1 to 500 mM.
N1 and A1 were mixed in a 1:1 ratio in CDCl3 with final concentrations of 1 mM, 5 mM, 25 mM, 100 mM, 200 mM and 500 mM. The mixtures were capped and incubated at 60 °C. Final products and yields were confirmed by 1H NMR. All groups gave near quantitative yields after overnight incubation. At higher concentrations, white crystalline precipitates were observed because of the lower solubility of the macrocycle. Each group was repeated three times.

Supplementary Figure 4: Synthesis of compound A3
A3: 1,3-Bis(bromomethyl)benzene (264 mg, 1 mmol) was dissolved in 10 mL DMF. tert-Butylamine (438 mg, 6 mmol) and K2CO3 (138 mg, 1 mmol) were added to the solution. The suspension was stirred at room temperature for about 24 h and the reaction was monitored by TLC. After completion, the reaction was quenched with 20 mL water and then extracted with DCM (30 mL × 3). The organic layers were combined, washed twice with brine and then dried with anhydrous Na2SO4. The solvent was then removed and the crude product was purified by flash column chromatography. The final product was obtained as a colorless oil, yield 85%.
A4: 3,5-Bis(chloromethyl)pyridine (176 mg, 1 mmol) was dissolved in 10 mL DMF. tert-Butylamine (438 mg, 6 mmol) and Cs2CO3 (326 mg, 1 mmol) were added to the solution. The suspension was stirred at room temperature for about 12 h and the reaction was monitored by TLC. After completion, the reaction was quenched with 20 mL water and then extracted with DCM (30 mL × 3). The organic layers were combined, washed twice with brine and then dried with anhydrous Na2SO4. The solvent was then removed and the crude product was purified by flash column chromatography. The final product was obtained as an orange/yellow solid, yield 98%.

Supplementary Figure 6: Synthesis of compound A5
A5: step 1: LiAlH4 (2.277 g, 60 mmol) was suspended in 100 mL THF in an ice bath. A solution of dimethyl 5-hydroxyisophthalate (4.20 g, 20 mmol) in 100 mL THF was slowly added to the LiAlH4 suspension under vigorous stirring. The mixture was then stirred at 60 °C for 2 h and monitored by TLC. After completion, the reaction was quenched with saturated Na2SO4 solution (1 mL) and neutralized with concentrated HCl solution (~2 mL). The mixture was then dried with anhydrous Na2SO4. The solid was filtered off and the filtrate was concentrated, giving (5-hydroxy-1,3-phenylene)dimethanol as a colorless oil, which was used directly in the next step. 1H NMR (500 MHz, CDCl3): δ 6.81 (s, 1H), 6.71 (s, 2H), 4.53 (s, 4H).

Supplementary Figure 8: Synthesis of compound A7
A7: 1,4-Bis(bromomethyl)benzene (264 mg, 1 mmol) was dissolved in 10 mL DMF. Adamantylamine (907.5 mg, 6 mmol) and K2CO3 (138 mg, 1 mmol) were added to the solution. The suspension was stirred at 50 °C for about 24 h and the reaction was monitored by TLC. After completion, the reaction was quenched with 20 mL water and then extracted with DCM (30 mL × 3). The organic layers were combined, washed twice with brine and then dried with anhydrous Na2SO4. The solvent was then removed and the crude product was purified by flash column chromatography. The final product was obtained as a colorless oil, yield 85%.
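The reagent amounts quoted for the diamine syntheses can be cross-checked by converting the weighed masses into millimoles. The sketch below does this for the A7 procedure. The molecular formulas (C8H8Br2, C10H17N, K2CO3) are inferred from the compound names and the standard atomic masses are rounded values; neither is stated in the procedure itself, and the helper names are hypothetical.

```python
# Mass-to-millimole check for the A7 procedure.
# Formulas are inferred from compound names (assumption, not from the text).

ATOMIC_MASS = {  # g/mol, rounded standard atomic masses
    "C": 12.011, "H": 1.008, "N": 14.007,
    "O": 15.999, "Br": 79.904, "K": 39.098,
}

REAGENTS = {  # name -> (mass in mg, elemental composition)
    "1,4-bis(bromomethyl)benzene": (264.0, {"C": 8, "H": 8, "Br": 2}),
    "adamantylamine": (907.5, {"C": 10, "H": 17, "N": 1}),
    "K2CO3": (138.0, {"K": 2, "C": 1, "O": 3}),
}


def molar_mass(composition: dict) -> float:
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[el] * n for el, n in composition.items())


def millimoles(mass_mg: float, composition: dict) -> float:
    """Amount in mmol; mg divided by g/mol gives mmol directly."""
    return mass_mg / molar_mass(composition)


if __name__ == "__main__":
    for name, (mass_mg, comp) in REAGENTS.items():
        print(f"{name}: {millimoles(mass_mg, comp):.2f} mmol")
```

The computed amounts agree with the stated 1 mmol of dibromide, 6 mmol of amine and 1 mmol of base, that is, six equivalents of amine per dibromide.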

Generation of the macrocycle library and calculation of the yields
General procedure: The diisocyanate (0.1 mmol) and diamine (0.1 mmol) were mixed directly in CDCl3 (2 mL) and then incubated at 60 °C. The macrocycle formation process was monitored by 1H NMR and further confirmed by 13C NMR and MALDI-TOF. The yields were calculated by integration of the 1H NMR spectra, as shown by the example below. The NH protons from all species (macrocycles and oligomers) fell into the specified region (6.15 to 6.4 ppm). The integration of this region was set to 1. The peak at 6.2 ppm came from the target macrocycle and its integration was 0.61. The yield of the macrocycle was therefore taken as 0.61 (61%) in this case. Different peaks may be chosen to calculate the yields, depending on how well the peaks can be differentiated.
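The integration-based yield described above is a simple ratio, and the reaction concentration follows from the amounts in the general procedure. A minimal Python sketch is given here for illustration; the function names are hypothetical and the only numerical inputs are those quoted in the text.

```python
# Yield from 1H NMR integration and the concentration of the general
# procedure; function names are illustrative.

def nmr_yield(target_integral: float, region_integral: float = 1.0) -> float:
    """Fraction of NH signal belonging to the target macrocycle."""
    if region_integral <= 0:
        raise ValueError("region integral must be positive")
    return target_integral / region_integral


def concentration_mM(amount_mmol: float, volume_mL: float) -> float:
    """Concentration in mM (mmol/mL equals mol/L, hence the factor 1000)."""
    return amount_mmol / volume_mL * 1000.0


if __name__ == "__main__":
    # NH region (6.15 to 6.4 ppm) normalized to 1; macrocycle peak = 0.61
    print(f"yield: {nmr_yield(0.61):.0%}")
    # 0.1 mmol of each building block in 2 mL CDCl3
    print(f"concentration: {concentration_mM(0.1, 2.0):.0f} mM")
```

The 0.1 mmol in 2 mL of the general procedure corresponds to the 50 mM concentration used for the library.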
It should be noted that the macrocycle formation was thermodynamically controlled; thus, factors such as concentration and temperature can affect the final equilibrium. In addition, an exact 1:1 molar ratio of the building blocks was very important for achieving high yields of the macrocycles, so the purity of the building blocks and the accuracy of weighing can strongly affect the result. For the yields reported here, all experiments were performed under the above-mentioned conditions and were repeated three times. Structures and yields of the macrocycles are listed below.

Supplementary Figure 11: Structures of the macrocycles in the library and their yields under experimental condition.
[NiAj]x refers to the macrocycle obtained from the combination of the diisocyanate Ni and the diamine Aj, x = 1 or 2. The number refers to its yield under the reaction conditions. Since only a few single-crystal structures were obtained, the conformations drawn here may or may not be correct.

Monitoring of the macrocycle formation kinetics
The formation kinetics of some of the combinations (N1 to N4 and A1 to A3) were tested and monitored by NMR. The diisocyanate (0.1 mmol in 1 mL CDCl3) and diamine (0.1 mmol in 0.5 mL CDCl3) were quickly mixed, rinsed with 0.5 mL CDCl3 and then subjected to NMR immediately. The mixtures were then kept at 60 °C and NMR spectra were taken at various intervals. The yields were calculated by integration of the 1H NMR spectra, as shown in part 3.

Representative NMR spectra
Several representative 1H and 13C NMR spectra, as well as their corresponding MALDI-TOF spectra, are shown below. It should be noted that all the spectra came directly from the mixtures without any purification (50 mM in CDCl3, incubated at 60 °C).

DFT calculations
We performed quantum chemistry calculations using the Gaussian09 package to determine the relative energies of the cis and trans conformations. To achieve high accuracy, calculations were performed with Density Functional Theory (DFT) and Møller-Plesset second order perturbation theory (MP2). For the DFT calculations, the PBE functional at the generalized gradient approximation (GGA) level and Becke's three parameter hybrid exchange functional with the Lee-Yang-Parr correlation functional (B3LYP) at the hybrid level were selected. The 6-31G(d,p) basis set was used for both DFT functionals and the aug-cc-pVDZ basis set was used for MP2. All calculations were performed in the gas phase. Energies of two sets of model compounds, which vary only in one substituent and represent aliphatic HUBs (MC2 and MC2') and aromatic HUBs (MC7 and MC7'), were calculated. The energy differences between the cis and trans conformations (ΔGtrans-cis) are listed below.

DFT calculations on different conformers of MC4
DFT calculations of the linear analogue MC4 (the [1:1] adduct of N2 and A2 with reactive chain ends) were performed at the PBE/6-31G(d) and B3LYP/6-31G(d) levels of theory. 12 different conformations were generated by rotating around the C(O)-N(tBu) bond of the adduct, which converged into 5 different metastable conformations after local energy minimization. Relative free energy was defined as the energy difference with respect to the most stable conformation. d was defined as the distance between the N of the free amine and the C of the free isocyanate between which the reaction happens. 1 was defined as the angle between vectors v1 and v2, which characterized the degree of folding of the urea chain. 2 was defined as the angle between vectors v3 and v4, which characterized the position of the bulky t-Bu group with respect to the carbonyl group. Cis conformation is defined as 2

Exchange reaction of two macrocycles and their model compounds
For the exchange reaction, 5 mg of each compound was dissolved in 600 µL CDCl3 in a sealed NMR tube. The NMR instrument was set to 55 °C and the sample was allowed to equilibrate in the instrument at 55 °C for 5 min. The sample was then ejected, 20 µL butyl isocyanate was quickly added to the tube and the sample was returned to the NMR instrument immediately. Spectra were taken at various intervals. Remaining ratios were determined by integration of the original peaks and the new peaks. Since butyl isocyanate was in large excess and its concentration can be regarded as constant, the exchange reactions can be considered pseudo-first order.
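Under the pseudo-first-order assumption stated above, the remaining fraction decays as exp(-k_obs t), so k_obs can be estimated from the slope of ln(remaining fraction) against time. The following Python sketch shows one way to do this with an ordinary least-squares fit. It is not the authors' analysis script, the function names are hypothetical, and the time points in the usage example are synthetic values generated from a known rate constant, not measured data.

```python
import math

# Pseudo-first-order fit: ln(fraction remaining) = intercept - k_obs * t.
# Illustrative sketch only; names and data are hypothetical.


def fit_k_obs(times: list, fractions: list) -> float:
    """Least-squares estimate of k_obs from remaining fractions."""
    if len(times) != len(fractions) or len(times) < 2:
        raise ValueError("need at least two matched (time, fraction) points")
    ys = [math.log(f) for f in fractions]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    return -sxy / sxx  # slope is -k_obs


def half_life(k_obs: float) -> float:
    """Half-life of a first-order decay, in the time unit of k_obs."""
    return math.log(2) / k_obs


if __name__ == "__main__":
    # Synthetic example: decay generated with k = 0.02 per minute.
    k_true = 0.02
    times_min = [0, 10, 20, 40, 80]
    remaining = [math.exp(-k_true * t) for t in times_min]
    k_fit = fit_k_obs(times_min, remaining)
    print(f"k_obs = {k_fit:.4f} per min, t1/2 = {half_life(k_fit):.1f} min")
```

Comparing the fitted k_obs values of a macrocycle and its model compound under identical conditions gives a direct measure of their relative exchange rates.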

Control experiment
We performed a similar reaction in which only the t-Bu group in A1 was changed to i-Pr (A1'). In this case, only a mixture of oligomeric molecules was obtained, even with prolonged incubation time.
Although it is known that A1'-based hindered ureas are less dynamic than A1-based ones, our previous work has shown that the k-1 of the corresponding urea structure is still large enough for the mixture to reach its thermodynamically stable state under mild conditions. The mixture was shown to reach chemical equilibrium after 15 days without exclusive formation of macrocyclic products.

Supplementary Figure 47: Concentration dependent NMR of MC9 (500 MHz, CDCl3). The NH peak and t-Bu peak were enlarged for clarity.

NOE study:
First, the 1H NMR spectrum of a mixture of HUM1 (25 mM) and MC5 (50 mM) in CDCl3 was recorded on an Agilent VNS 750NB spectrometer. The peaks of the two compounds were resolved, and the t-Bu peak of HUM1 was then irradiated at 1.47 ppm with a selective bandwidth of 13.2 Hz. The NOE peaks were recorded.

Simulation results
To further support our hypothesis that t-Bu⋅⋅⋅macrocycle interactions contribute to the high yield, atomistic molecular simulations were performed to calculate the average monomeric nonbonding energy in macrocycles of various sizes n (denoted as n-mer), either in single-macrocycle states (denoted as nmer-s) or in multiple-macrocycle clusters (denoted as nmer-m). A thermodynamic model that uses the computed monomeric energies was proposed to compute the stable ring size distribution that minimizes the system free energy (see part 2). For the N1+A1 reaction, the average monomeric energies computed for the 1mer, 2mer and 3mer are shown in Supplementary Figure 48a. These energies were then plugged into the thermodynamic model to predict the stable distribution. When inter-macrocycle interactions were omitted (nmer-s system), the 2mer, 3mer and 4mer coexisted, with the 3mer being the major species (Supplementary Figure 48b). However, when the interactions between macrocycles were considered (nmer-m system), the corresponding size distribution showed the predominance of the 2mer alone, which was consistent with the experimental results. This implies that the interactions between macrocycles stabilize the 2mer and make it the much more favored species. No such effect was observed in the control N1+A1' system.
To further investigate the mode of interaction between the macrocycles, the dimerization potential of mean force (PMF) was computed for a pair of 2mer macrocycles and for a pair of 3mer macrocycles (Supplementary Figure 48c) using the umbrella sampling technique; the 1mer system was not included since its extremely high ring strain overwhelms the intermolecular interactions. The normalized stabilization energy is defined as (PMF well depth)/n. The 2mer showed a higher stabilization free energy (4.0 kT) than the 3mer (2.5 kT) (Supplementary Figure 48c, inset). To see how the macrocycles interact with each other, the motions of the macrocycles were tracked around the well region, where the center of mass distance between the two rings is about 0.8 nm. For the 2mer system, a 'pocket effect' was clearly observed, with a t-Bu group from one macrocycle sitting in the cavity of the other, which lowered the energy and was consistent with the structure in the solid state (Supplementary Figure 48e). In contrast, for the 3mer system, no clear 'pocket effect' was observed. This was further supported by the probability density distribution of the distance between the t-Bu groups and the 'pocket center' of the other macrocycle (Supplementary Figure 48d). In the 2mer system, the t-Bu group has a much higher probability of appearing very close to the pocket center of the other macrocycle, signaling the existence of an extra t-Bu⋅⋅⋅macrocycle interaction. In the 3mer system, the distribution is more scattered and does not show an obvious preference.
Supplementary Figure 48: a Average monomeric energy of the N1+A1 system and the control N1+A1' system; the number indicates the size of the ring, 's' considers only the energy of a single ring, and 'm' includes interactions between rings. For N1+A1, the energy of the 2mer-m system is lower because of ring-ring interactions, whereas for the N1+A1' system, ring-ring interactions did not result in an energy drop. b Calculated size distribution of the tBu system based on E_s and E_m. c Dimerization potential of mean force (PMF) between two 2mer and 3mer macrocycles. Inset: normalized well depth. d Probability density distribution of the distance between the tert-butyl group and the pocket center of the other ring around the well of the PMF curve for different sized rings in both the t-Bu and i-Pr systems. e Simulated mode of interaction of two HUM1 macrocycles in the PMF well region.

Self-assembled fiber of UM1
General procedure for de-tert-butylation: The HUMs were treated with TFA for 5 min at room temperature. The TFA was then removed in vacuo. The solid was washed with dilute NaHCO3 solution, water and acetone, and then dried in vacuo.

Binding of UM1 with anions
Urea macrocycles have been shown to be potent anion binders. As a proof of concept, two urea macrocycles obtained by coupling HUB chemistry with acid-assisted de-tert-butylation, UM1 and UM2, were tested for their anion binding ability. UM1 showed selective binding to organic salts such as phosphates and acetates and no interaction with halides, while UM2 showed size-selective binding to different halides.
To determine the binding constant, UM1 and tetrabutylammonium phosphate monobasic were chosen as the model system. The concentration of the host molecule UM1 was kept constant and different equivalents of the guest tetrabutylammonium phosphate were added. NMR spectra were recorded to monitor the chemical shift changes.

Crystallographic Data
Single crystals of HUM1 were grown by slow evaporation of a solution of HUM1 in a mixture of THF and acetonitrile. Single crystals of HUM3 ([N3A2]2) were grown by slow evaporation of a solution of HUM2 in a mixture of chloroform and hexane. Single crystals of [N4A2]2 were grown by slow evaporation of a solution of [N4A2]2 in a mixture of chloroform and hexane. Single crystals of MC5 were grown by slow evaporation of a solution of MC5 in a mixture of chloroform and hexane.
The following applies to all four structures. The crystals diffracted very weakly due to the lack of heavy atoms and disordered solvents within the cavity. Intensity data were collected on a Bruker D8 Venture kappa diffractometer equipped with a Photon 100 CMOS detector. An Iµs microfocus source provided the Cu Kα radiation (λ = 1.54178 Å), which was monochromated with multilayer mirrors. The collection, cell refinement and integration of intensity data were carried out with the APEX2 software (15). Face-indexed absorption corrections were performed numerically with SADABS (16). The structures were solved with the intrinsic phasing method of SHELXT (17). All structures were refined with the full-matrix least-squares SHELXL program. Analysis of the available data results in a chemically reasonable structure model that confirms the target molecule was synthesized.