Molecular basis of regio- and stereo-specificity in biosynthesis of bacterial heterodimeric diketopiperazines

Bacterial heterodimeric tryptophan-containing diketopiperazines (HTDKPs) are a growing family of bioactive natural products. They are challenging to prepare by chemical routes because of their polycyclic and densely functionalized backbones. Through functional characterization and investigation, we herein identify a family of three related HTDKP-forming cytochrome P450s (NascB, NasS1868 and NasF5053) and reveal four critical residues (Gln65, Ala86, Ser284 and Val288) that control their regio- and stereo-selectivity to generate diverse dimeric DKP frameworks. Engineering these residues can alter the specificities of the enzymes to produce diverse frameworks. Crystal structures (1.70–1.47 Å) of NasF5053 (ligand-free and substrate-bound NasF5053 and its Q65I-A86G and S284A-V288A mutants), together with molecular dynamics simulations, elucidate the specificity-conferring mechanism of these residues. Our results provide clear molecular and mechanistic insight into this family of HTDKP-forming P450s, laying a solid foundation for rapid access to the molecular diversity of HTDKP frameworks through rational engineering of the P450s.


General materials and methods
Escherichia coli and Mycobacterium strains were cultivated and manipulated according to standard methods 2,4. Mycobacterium smegmatis mc²155 was cultivated either on Luria-Bertani (LB) media agar plates or in Lemco liquid media (5 g peptone, 5 g beef extract, 5 g NaCl, 0.1% Tween 80 in 1 L tap water). Strains and plasmids used in this study are listed in Supplementary Table 6

P450 enzyme assays
The activities of wild-type and mutant P450s were assayed in HEPES buffer (50 mM HEPES, 100 mM NaCl, pH 7.5) containing 0.1 µM purified P450s, 1 mM cWL-PL, 1 µM spinach ferredoxin (Fd), 1 µM spinach ferredoxin reductase (FdR), 2 mM NADP+, 2 mM glucose and 2 mM glucose dehydrogenase (GDH). Expression and purification of Fd, FdR and GDH were described previously 3. The reaction was incubated at 4 °C. After 24 h, two volumes of ethyl acetate were added to quench the reaction, followed by sonication for 5 minutes. After separation of the aqueous and organic phases, the ethyl acetate phase was dried on a rotavapor and the residue was re-dissolved in
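The assay composition above can be turned into a pipetting plan with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch only: the final concentrations are those stated for the assay, the 100 mM cWL-PL stock is the one used in the library screen, and all other stock concentrations, the 100 µL reaction volume and the function names are hypothetical. GDH is left out and can be added in whatever units its stock is specified in.

```python
# Final concentrations taken from the assay description, all in µM.
FINAL_UM = {
    "P450": 0.1,
    "cWL-PL": 1000.0,
    "Fd": 1.0,
    "FdR": 1.0,
    "NADP+": 2000.0,
    "glucose": 2000.0,
}

# Stock concentrations in µM. Only the cWL-PL stock (100 mM) appears in the
# text; every other value here is a hypothetical placeholder.
STOCK_UM = {
    "P450": 10.0,
    "cWL-PL": 100000.0,
    "Fd": 100.0,
    "FdR": 100.0,
    "NADP+": 50000.0,
    "glucose": 100000.0,
}


def volume_from_stock(final_conc: float, stock_conc: float, total_volume_ul: float) -> float:
    """Return the stock volume (µL) that gives final_conc in total_volume_ul (C1*V1 = C2*V2)."""
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_conc * total_volume_ul / stock_conc


def pipetting_plan(total_volume_ul: float) -> dict:
    """Return µL of each stock plus the buffer volume needed to reach the total."""
    plan = {
        name: volume_from_stock(FINAL_UM[name], STOCK_UM[name], total_volume_ul)
        for name in FINAL_UM
    }
    plan["buffer"] = total_volume_ul - sum(plan.values())
    return plan


if __name__ == "__main__":
    for component, volume in pipetting_plan(100.0).items():
        print(f"{component:>8s}: {volume:6.2f} µL")
```

With these placeholder stocks the additives occupy a tenth of the reaction volume, so the HEPES buffer is diluted only slightly; more dilute stocks would require a more concentrated buffer.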

Scaled biocatalytic reaction and purification of NAS-B, ASP-A and NAS-E
Based on the whole-cell catalysis system we developed previously 3, genes of NasF5053 or its mutants, together with the plasmid pWHU2487 expressing Fd and FdR, were co-transferred into E. coli GB05-dir-T7. The resulting bacteria were inoculated in LB media (5 L) and grown to an OD600 of 0.8-1.0 at 37 °C. After this, the cells were [...] in Supplementary Fig. 1-3.

Overlap-PCR was used to introduce mutations on multiple residues simultaneously. Taking fragment-7 of NascB as an example, two pairs of primers, nascB-F/nascB-7-R and nascB-7-F/nascB-R, were used to prepare two DNA fragments, respectively, based on the pWHU2485 DNA template. After purification with a gel extraction kit, these two fragments were used as the template to amplify the whole nascB gene, using nascB-F and nascB-R as primers. This PCR product was purified with the DNA gel extraction kit and then cloned into the pET28a vector with the Gibson assembly kit. The plasmids were then isolated and sequenced to verify the desired mutations.
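The logic of the overlap-PCR step can be illustrated in silico: the two first-round products both carry the mutated region, and the second round fuses them across that shared overlap. The sketch below is a toy model under stated assumptions; the 60-nt template, the mutated position and all function names are hypothetical and are not the nascB sequence or the primers used in this study.

```python
def introduce_mutation(template: str, start: int, new_bases: str) -> str:
    """Return the template with new_bases substituted at 0-based position start."""
    return template[:start] + new_bases + template[start + len(new_bases):]


def split_with_overlap(mutated: str, start: int, length: int, flank: int):
    """Mimic the two first-round PCR products, which share the mutated region."""
    left_end = min(len(mutated), start + length + flank)
    right_start = max(0, start - flank)
    return mutated[:left_end], mutated[right_start:]


def overlap_join(frag_a: str, frag_b: str, min_overlap: int = 10) -> str:
    """Fuse two fragments where the 3' end of frag_a matches the 5' end of frag_b."""
    longest = min(len(frag_a), len(frag_b))
    for size in range(longest, min_overlap - 1, -1):
        if frag_a.endswith(frag_b[:size]):
            return frag_a + frag_b[size:]
    raise ValueError("fragments do not share a sufficient overlap")


# Hypothetical 60-nt toy template (not the nascB gene).
TOY_TEMPLATE = "ATGACCACGACCGCTGAACTGTTCGATCCGCAGGCACTGTCTAAAGGTGAACGTCTGTAA"

if __name__ == "__main__":
    # Replace the codon at positions 30-32 (CAG, Gln) with ATC (Ile).
    mutated = introduce_mutation(TOY_TEMPLATE, 30, "ATC")
    left, right = split_with_overlap(mutated, 30, 3, flank=9)
    product = overlap_join(left, right)
    print(len(left), len(right), product == mutated)
```

The overlap search starts from the longest candidate, so the sketch assumes that the template has no long internal repeats; real primer design would also consider melting temperature and strand orientation, which are ignored here.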

Construction and screening of NasF5053 mutant libraries
The 2-step PCR method developed by Reetz et al. 5 was applied to construct the NasF5053 library. With the pET21a-NasF5053 (wild-type) plasmid as the template, primers F5053-SM-NNK-F and F5053-SM-NNK-R were used to amplify the megaprimers. After all the megaprimers were confirmed by DNA agarose electrophoresis, they were used directly to amplify the whole plasmid. Templates were removed by DpnI digestion at 37 °C for 7 h, which was confirmed by electrophoresis. PCR amplicons (2 μL) were directly transformed into 100 μL electrocompetent E. coli GB05-dir-T7, which contained the plasmid pWHU2487 expressing Fd and FdR. After adding 900 μL LB media, the cells were recovered at 37 °C for 1 h and then spread onto agar plates containing kanamycin (50 μg mL⁻¹), ampicillin (100 μg mL⁻¹) and apramycin (50 μg mL⁻¹).

After incubation for 14 h at 37 °C, 400 individual colonies were picked from the plate and inoculated into 500 μL of LB in a 2 mL 96-well plate. This plate was grown for 12-16 h at 37 °C and 220 rpm. Portions (100 μL) of each culture were transferred to a new 0.5 mL 96-well plate containing 100 μL of sterile glycerol (40%, v/v) for stock. The remaining culture was supplemented with 100 μmol IPTG, 400 μmol ALA and 200 μmol (NH4)2Fe(SO4)2·6H2O, and expression was continued at 18 °C and 220 rpm for 20 h. The cells were harvested by centrifugation at 3,000 rpm at 4 °C and resuspended in 400 μL M9 media. Then, 1 μL of cWL-PL (100 mM) was added to the M9 media. After 48 h incubation at 18 °C, the reaction mixture was extracted with 1 mL ethyl acetate three times. The organic phase was transferred and dried under vacuum at low temperature. Metabolites were subsequently redissolved in methanol and filtered through a 0.45 μm membrane to remove particles. The activity of each mutant was analyzed by UPLC-MS under the same conditions described above.
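The relationship between the number of colonies screened and the fraction of an NNK library that is sampled can be estimated with the usual oversampling approximation, coverage ≈ 1 − exp(−N/V), for N colonies and V equally likely codon variants. The sketch below applies it to the 400 colonies picked here. The number of positions randomized per library is not stated in this section, so the one- and two-site cases are assumptions for illustration, as are the function names; the estimate is at the codon level and ignores codon redundancy and any bias in the degenerate primers.

```python
import math


def nnk_variants(sites: int) -> int:
    """Number of distinct codon combinations when `sites` positions are NNK-randomized."""
    # N = A/C/G/T (4 choices) and K = G/T (2 choices): 4 * 4 * 2 = 32 codons per site.
    return 32 ** sites


def expected_coverage(colonies: int, variants: int) -> float:
    """Expected fraction of variants seen at least once, assuming equal frequencies."""
    return 1.0 - math.exp(-colonies / variants)


def colonies_for_coverage(variants: int, coverage: float) -> int:
    """Colonies to screen so that the expected coverage reaches `coverage` (0 < coverage < 1)."""
    return math.ceil(-variants * math.log(1.0 - coverage))


if __name__ == "__main__":
    for sites in (1, 2):  # assumed numbers of randomized positions
        variants = nnk_variants(sites)
        print(
            f"{sites} site(s): {variants} codon variants, "
            f"coverage with 400 colonies = {expected_coverage(400, variants):.3f}, "
            f"colonies for 95% coverage = {colonies_for_coverage(variants, 0.95)}"
        )
```

Under these assumptions, 400 colonies oversample a single NNK site many times over, whereas two simultaneously randomized sites would be covered only in part.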

NasF5053 and its mutants
The first 5 amino acids (MTTTA) of NasF5053 were confirmed to have no effect on enzyme activity and were thus removed as presumably flexible, for potential benefit in crystallization. With the first 5 amino acids removed, the NasF5053 gene was amplified with the primer pair F5053-Xray-For/F5053-Xray-Rev and cloned into pSrtA9 through ligation-independent cloning 7. Using the New England Biolabs Q5 site-directed mutagenesis kit, the NasF5053-Q65I-A86G mutant was prepared through two successive rounds of single mutagenesis (the first primer pair: F5053-Q65I-For/