Combined crystal structure prediction and high-pressure crystallization in rational pharmaceutical polymorph screening

Organic molecules, such as pharmaceuticals, agro-chemicals and pigments, frequently form several crystal polymorphs with different physicochemical properties. Finding polymorphs has long been a purely experimental game of trial-and-error. Here we utilize in silico polymorph screening in combination with rationally planned crystallization experiments to study the polymorphism of the pharmaceutical compound Dalcetrapib, with 10 torsional degrees of freedom one of the most flexible molecules ever studied computationally. The experimental crystal polymorphs are found at the bottom of the calculated lattice energy landscape, and two predicted structures are identified as candidates for a missing, thermodynamically more stable polymorph. Pressure-dependent stability calculations suggested high pressure as a means to bring these polymorphs into existence. Subsequently, one of them could indeed be crystallized in the 0.02 to 0.50 GPa pressure range and was found to be metastable at ambient pressure, effectively derisking the appearance of a more stable polymorph during late-stage development of Dalcetrapib.


Supplementary
(3) 11.181(15) 10.697 (2) b/Å 10.4524(9) 9.8526(2) 9.809(13) 9.840 (2) c/Å 9.7642 Our aim is to predict disorder that is an equilibrium property of a crystal form and contributes to the crystal free energy. For this type of disorder we will use the term thermodynamic disorder.
Crystallographers make a distinction between dynamic versus static disorder, according to whether atoms move on the timescale of a diffraction experiment or not. For samples in thermodynamic equilibrium, dynamic disorder actually is thermodynamic disorder. Static disorder typically is not thermodynamic disorder and may originate from stacking faults or unfavorable molecular conformations build into the crystal during the growth process. However, energy landscapes with high energy-barriers but energetically close energy minima are conceivable for which static disorder would actually be thermodynamic disorder.
Thermodynamic disorder occurs when local atomic rearrangements are possible at an energetic cost of not much more than kT, where k is the Boltzmann constant and T is the temperature.
Because a local rearrangement applied to all molecules in the unit cell of an ordered structural model leads to another ordered structural model, the presence of families of similar structures among the first few predicted structures with energy differences less than 3kT ≈7.5 kJ mol -1 (room temperature) per molecule must be considered as a strong indication for a possibly disordered crystal structure. However, the presence of a family is not a necessary condition, since a local atomic rearrangement may be energetically feasible for a single molecule surrounded by molecules in the original configuration, but not for all molecules at once. It is also not a sufficient condition, because the energetic cost for the change of a molecular conformation may be low for a concerted rearrangement of a one-, two-or three-dimensional net of molecules, but not for a single molecule. A pragmatic approach is to search for thermodynamic disorder whenever a family of similar and energetically close structures is found.
To find the configurations that contribute to a disordered crystal structure, two different strategies have been tried for the predicted structures 1, 2 and 3. The first approach, called the substitution method, is to change the conformation of one molecule in a 1×1×1 super cell to another  Assuming that all molecules in a crystal are disordered independently, the thermal occupancies of the configurations and their enthalpy and free energy contributions are readily evaluated using textbook formulas 7 .
The partition function is defined as: Here i refers to the structure index and j refers to the configuration index. ΔE i,j is the energy difference with respect to configuration 1 that can be taken from Supplementary Table 2.
Occupancies can be obtained as follows: The free energy is related to the partition function: Here the lattice energy of the ordered state, E i , has been added explicitly. It can be found in Supplementary Table 1.
At ambient pressure, the enthalpy is practically identical to the average energy: Using the above formulas and the energy values from Supplementary Tables 1 and 2, the   occupancies in Supplementary Table 2 and the free energies curves for the disorder models of structures 1, 2 and 3 in Supplementary Fig. 2 have been obtained.
In terms of free energy, the disordered structure 3 is more stable than the disordered structure 2 at room temperature, which is in agreement with the fact that the disorder model of structure 3 matches the experimentally observed form C very well. Structure 1 is only slightly disordered.
The most favorable atomic rearrangement (configuration 2) has a thermal occupation of less than 1.3% below the temperature of the form Aform B phase transition at -87 °C. Below the phasetransition temperature, structure 1 is more stable than structures 2 and 3 even if the configurational free energy contribution is taken into account.
The nature of the disorder in form A has turned out to be difficult to understand. The experimental phase transition enthalpy is about 3.3 kJ mol -1 . At the phase-transition temperature, the free energy difference is zero, such that the entropy difference can be derived from the phase transition enthalpy: Relating the entropy to a number of states, it follows that per molecule form A has 8 times more states than form B. It turned out that none of the predicted structures of the structure-1 family could accommodate a substantial amount of disorder. Since the crystal structure prediction study had been carried out for one molecule per asymmetric unit only, an attempt was made to find structures with two molecules per asymmetric unit also belonging to the structure-1 family that may be able to accommodate the required amount of disorder. In a first step, the Monte Carlo disorder search tool was applied to structure 1, not limiting the regions of structural change to zero-dimensional islands. The space-group symmetry of the obtained disordered models was analyzed, and for structures with two molecules per asymmetric unit the Monte Carlo disorder search tool was applied again, this time limiting the regions of change to zero-dimensional islands. Only one structural model with a substantial amount of disorder could be identified this way. Its configurations are listed in Supplementary Table 3

CPU time consumption
The crystal structure prediction study, including the generation of a tailor-made force field, required 16.5 weeks of CPU time on a LUNIX cluster featuring 64 quad core AMS Opteron processors with 2 GB RAM per core and connected by an InfiniPath network. The disorder analysis required 6 additional weeks of CPU time on the same cluster.

Structural models
All predicted structures and disorder models mentioned in the main text or the supplementary information are supplied as Supplementary Data (see Supplementary Note 3).

Supplementary Note 2 Experimental procedures
Form C

High-pressure crystallization and crystal growth
The compound was used as received.

High-pressure crystallography
In situ single-crystal X-ray diffraction of form C at 0.02 GPa Single-crystal X-ray diffraction data were collected at ambient temperature on a Bruker Apex II diffractometer equipped with Mo sealed tube radiation. A standard data collection strategy that aims at optimizing data redundancy and completeness was used. To further increase completeness, two data sets with the DAC in two different orientations were collected. The scattering fall off at a resolution greater than 1.0 Å gave a first good indication that the structure is disordered. For data reduction, a standard strategy was used 10 .
The structure was solved by direct methods using the program SIR92 11 . Full-matrix least-squares structure refinement against F 2 was performed using the program SHELXL 12 (Version 2013/3) through the GUI interface SHELXLE 13 .
Refinement was carried out using two unit-cell settings: has a significant effect on the R-factor, which increases to a value of over 7% when disorder is not taken into account. Restraints on 1,2 and 1,3 bond distances were used to ensure a chemically meaningful geometry; Anisotropic displacement parameters (ADPs) were also refined subject to enhanced rigid-body restraints (RIGU) 14 ; the disordered ethyl groups were refined isotropically and for some atoms positions and U iso constraints were used to enforce a chemically-reasonable model. All H-atoms were refined with a riding model. A final R-factor of 5.3 % is extremely satisfactory for this structure and the given data completeness and resolution. Further crystallographic details are given in Supplementary Table 4 and Supplementary Data 1.

Comparison of experimental and computational structures
From comparison of the unit-cell constants, molecular geometry and crystal packing, the experimentally found form C high-pressure structure shows an excellent correspondence with structure 3 from the computational study. An overlay of the two structures is shown in Supplementary Fig. 5. Although in theory the results of the calculations could be used to build more accurate refinement models, in practice, the limited amount of data that can be collected with the DAC precludes this without running into overparameterization and severe parameter correlation problems. It should be noted that in the case of the 3-pentyl group, the experimental model is actually an average of the computed configurations 2 and 5 (Supplementary Fig. 6a and 6b).

Recovery of a single-crystal sample
In order to investigate whether the single-crystal sample of form C crystallized at high pressure is also stable at ambient pressure, the pressure inside the DAC was slowly released from 0.02 GPa to ambient pressure and the crystal inspected visually under an optical microscope. The crystal was observed to dissolve but remained edged in between the gasket once the cell was opened.
The gasket was quickly mounted on the single-crystal diffractometer at ambient pressure. No single-crystal diffraction was observed from the sample but a diffraction shot was recorded at fixed sample position ( Supplementary Fig. 7). The quality of the diffraction pattern is very poor, no background was subtracted and the pattern can be at best used for semi-qualitative phase

Recovery of a polycrystalline sample
In order to better investigate the stability of form C at ambient pressure, the recovery of polycrystalline material crystallized in situ at high pressure was investigated. A 0.82 M THF solution was used for this experiment and crystals were obtained at 0.5 GPa. In situ diffraction experiments were performed allowing for some sample rotation during exposure to X-rays to improve the counting statistics and powder averaging. DAC/gasket background subtraction was also performed. The resulting powder pattern, shown in Supplementary Fig. 9, can be clearly attributed to form C. The cell was subsequently cooled for two hours at 298 K and for 30 minutes to 253 K. Once cooled, the cell was rapidly opened, the very modest amount of excess solution on the gasket dabbed with a cotton fiber and the gasket containing polycrystalline material mounted on the diffractometer. No appreciable dissolution was observed on decompression.
From the comparison of the powder patterns, it is clear that the decompressed sample corresponds to form A, suggesting that a solution-mediated transition took place on decompression and confirming the single-crystal recovery results.  Fig. 10).

Crystallization of form C at high pressure from the melt
The pressure inside the DAC was subsequently decreased to ambient but the cell was not opened to avoid possible contamination with crystallites of form A crystallized outside the sample chamber. The cell was transferred to the diffractometer; diffraction data clearly indicate the presence of both forms A and C ( Supplementary Fig. 11, top). Data were collected approximately 45 minutes after the pressure was released to ambient. Two further powder patterns were collected at two hour intervals ( Supplementary Fig. 11, middle and bottom). After two hours the sample had almost completely transformed to form A; after a further two hours the sample appeared to have completely transformed.
The results of these experiments indicate that whilst form C is recoverable to ambient pressure and temperature, this form is metastable and form A is the most thermodynamically stable form under these conditions.

Form A at 295 K
A single-crystal specimen was selected from a batch of form A that was grown from a 0.82 M THF solution at 253 K. Diffraction data were collected at ambient temperature on a Bruker Apex II diffractometer equipped with Mo sealed tube radiation. A standard data collection strategy that aims at optimizing data redundancy and completeness was used.
Both propyl and 3-pentyl side chains are disordered; significant displacement is also observed for the carbonyl oxygen, indicating motion for the whole corresponding side chain. Several structural models were tested for refinement and the one that best describes disorder without overparameterization is reported here. Disordered sites were fixed to half occupancy; when freely refined, occupancies for the 3-pentyl side chain are 0.594(6):0.406(6).
Disorder was not modeled for the propyl side chain: the preference for using an anisotropic model over an isotropic model with fewer parameters to describe large thermal motion has been discussed by Watkin 15 . As in the case of form C at high pressure, modeling of disorder has a significant effect on the R-factor. Restraints on 1,2 and 1,3 bond distances were used to ensure a chemically meaningful geometry; ADPs were refined for all non-H atoms, some atoms were further subject to enhanced rigid-body restraints (RIGU) 14 ; U iso and position constraints for atoms occupying the same site were used to enforce a chemically-reasonable model. All H-atoms were refined with a riding model. A final R-factor of 5.3 % (5.17 % with disorder fixed to 60:40) is satisfactory for this structure. Further crystallographic details are given in Supplementary Table 4 and Supplementary Data 2.

Form A at 223 K
In order to investigate the nature of disorder, i.e. static or dynamic, a multi-temperature experiment would ideally be carried out 16 . In the particular case of Form A, since the phase transition to the low-temperature form B takes place at relatively high low temperatures, the temperature window available for such an experiment is modest. Cooling of form A proved to be a challenging task: reflection broadening and crystal splitting occurred, irrespective of cooling rate, with the number of crystalline domains increasing as a function of increasing time, ultimately leading to a complete crystal destruction. Splitting was observed for crystals cooled in the 223-203 K temperature range. One of the challenges associated with data collection and processing is that as the crystal breaks up, it is no longer centered on the diffractometer, making data integration a non-straightforward task. After several attempts, a partial data collection lasting 7 hours was successfully performed on a specimen at 223 K.
A similar refinement model used for form A at 295 K was employed with the difference that, given the lower data resolution, the use of RIGU restraints was extended to all atoms. Compared  Possible reasons for the observed crystal cracking on cooling may be ascribed to the microstructure of form A, e.g. the presence of defects, the crystal's stress/strain gradients and its response to thermal stress. These effects are well known in crystals of inorganic compounds and proteins. The effects of temperature annealing were not investigated.

Disorder in form A
The atomic coordinates of the disordered high temperature form A are compatible with a mixture of the predicted structure 1 and structure 22. As far as the crystallographic data can tell, crystals grown at ambient temperature and at 253 K, and measured at room temperature, all show the same type and amount of disorder. This would indicate that the conformational disorder in form A is to a first approximation independent of the crystal growth conditions. The transition from From the available X-ray data, it was not possible to distinguish between sheets or microdomains or other types of disorder other than the one modeled. However, based on the available data, the possibility that form A corresponds to a mixture of two distinct polymorphs, echoing the case of aspirin, which was described as "intergrowths of two "polymorphic" domains" 17  Noted, no action taken. These distances are not exceedingly short. Some of these atoms are disordered; it is possible that this distance is an artifact of refinement. H-atoms were placed in calculated positions.

Form B
THETM01_ALERT_3_A The value of sine(theta_max)/wavelength is less than 0.550 Calculated sin(theta_max)/wavelength = 0.5498 The IUCr sin(theta)/lambda requirement of 0.6 has actually been reached. Copper radiation used.